Multiserver DoOO Data Management

This post can only possibly appeal to about 12 people and only when they’re really in the mood for weedsy code stuff. However, that’s about my normal readership, so here we go…

For the OU Create project, we now have 5 servers that are managed by Reclaim Hosting. We’ve got more than 4000 users, and, collectively, they have more than 5000 websites. Keeping track of all of the usernames, domains, and apps in use is difficult.

One of the ways that we study the digital ecology of these servers is by looking at the data.db files created by each server’s instance of Installatron. These database files keep track of all of the web apps installed using Installatron. Thus we have a record of every installation of WordPress, Drupal, Omeka, MediaWiki, or any of the other 140 or so apps with Installatron one-click installation packages. I can find the user’s name, the date, and the domain, subdomain, or subdirectory used for each installation. However, within the data.db file, there are five tables for all of this data and it’s a SQLite file, so it’s not exactly a quick or easy read. Further complicating everything is that each server has it’s own data.db file and each one is buried deep in the directory structure amongst the tens of thousands of files on the server.

Here at OU, we have a couple of websites that were built as derivatives of studying the data.db files. The first was community.oucreate.com/activity.

This site is built on Feed WordPress. We feed it the urls for each WordPress site in OU Create and it aggregates the blog activity into one feed averaging 300+ posts a week.

The other site is community.oucreate.com/sites.

Screen Shot 2017-12-06 of community.oucreate.com/sites

Sites provides a filterable set of cards displaying screen captures, links, and app metadata for all of the sites in OU Create. You can see what sites are running Omeka or Vanilla Forums or whatever other app you’d like.

To maintain these sites, I would normally go into each server and navigate up one level from the default to the server root, then go into the var directory, then the installatron directory, and then download the data.db file. Once I’ve downloaded it, I use some software to open the file, and then export the i_installs table to a csv. Then I find the installations since my last update of the system, copy the urls, and paste them into the Activity site or run a script against them for the Sites site. I repeat this for each server. This process is terrible and I hate doing it, so it doesn’t get done as often as it should.

This week, I wrote some code to do these updates for me. At the most basic level, the code uses secure shell (SSH) to login to a server and download a desired file. My version of the code loops (repeats) for each of my five servers downloading the data.db file and storing them all in one folder. Here is the code and below I’ll explain how I got here and why:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
import os
import paramiko
import sqlite3
import csv
 
#The keypasses list holds the passwords for the keys to each of the servers
keypasses=["xxxxxxxxxx", "xxxxxxxxxx", "xxxxxxxxxx", "xxxxxxxxxx", "xxxxxxxxxx"]
counter = 1
csvWriter = csv.writer(open("output.csv", "w"))
 
#loop through the keypass list getting the data.db files
for keypass in keypasses:
 
db = "data%s.db" % counter
servername = "oklahoma%s.reclaimhosting.com" % counter
keyloc="/Users/johnstewart/Desktop/CreateUserLogs/id_oklahoma%s" % counter
k = paramiko.RSAKey.from_private_key_file(keyloc, password=keypass)
ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
print ("oklahoma%s connecting" % counter)
ssh.connect(servername, username="root", pkey = k)
print ("connected")
sftp = ssh.open_sftp()
localpath = '/Users/johnstewart/Desktop/CreateUserLogs/data%s.db' % counter
remotepath = '../var/installatron/data.db'
sftp.get(remotepath,localpath)
sftp.close()
print ("data%s.db downloaded" % counter)
ssh.close()
#export the desired table from the database to the aggregated csv
with sqlite3.connect(db) as connection:
c = connection.cursor()
c.execute("SELECT * FROM i_users")
rows = c.fetchall()
csvWriter.writerows(rows)
print ("%s i_users exported to csv" % db)
counter=counter+1

 

On Monday, I wrote the first draft of this in bash and then rewrote it in a similar language called expect. With expect, I could ssh into a server and then respond to the various login prompts with the relevant usernames and passwords. However, this exposed the passwords to Man in the Middle attacks. If someone where listening to the traffic between my computer and the server, they would be able to intercept the username and password from within the file. This is obviously not the best way to do things.

The solution was to use an ssh key. These keys are saved to your local computer and provide an encrypted code to login to the server. You in turn use a password to activate the key on your own computer, so there’s no ‘man in the middle.’ Unfortunately, I had no idea how to do this. Luckily for me, Tim Owens is a fantastical web host and has a video explaining how to set up keys on Reclaim accounts:

I set up keys for each of the servers and saved them into my project folder. This also denecessitated the ‘expect’ script because I no longer needed to enter a password for each server.

I turned back to a bash shell script, but couldn’t figure out what to do with my .db files once I had downloaded them all. This morning I turned from bash to python which is very good at handling data files. Python also has the paramiko library, which simplifies the process for logging into and downloading files from servers. You can see in the loop part of the code above where I call several paramiko functions to establish a connection to the server and then use sftp to get the file I want.

Our servers are labeled numerically oklahoma1 through oklahoma5, and I had saved my keys for each server as id_oklahoma1 through id_oklahoma5, so it was easy to repeat the basic procedure for each server by writing a loop that repeated 5 times. Each time the loop occurs it just increases the number associated with the servers and keys. The loop also saves the data.db files locally as data1, data2, etc.

The last step was to use Python to compile the desired data from each of these data.db files. SQLite3 provided the needed methods for handling data files. I could connect to each database after I downloaded them. Then I called the table that I wanted from each table and “fetched” all of the rows from that table. From there, I can use the csv library to write those rows to a csv (an excel like, comma separated variable table). This whole process was part of the larger programmatic loop, so each time I pulled a database from a server, I was adding it’s table rows to the collective csv.

For me, this process will make it easy to pull a daily update of all of the domains in OU Create and then upload those to my two community websites. As we follow Tom Woodward and Marie Selvanadin’s work on the Georgetown Community site, these up-to-date lists of sites will make it easier to build sites that pull together information from the API’s of the many OU Create websites. The process could also be generalized to pull or post any files from a set of similar servers allowing for quicker maintenance and analysis projects. Hopefully Lauren, Jim, and Tim will find fun applications as they continue their Reclaim Hosting server tech trainings.

Omeka + HTML5Up

I have been working on a couple of Omeka sites this semester. I really like the Omeka system for its handling of meta data and the Neatline map extension for easy map creation. Omeka S just came out and looks fantastic, though I haven’t had a chance to build with it yet.

However, I do not like the Omeka themes. For a while I was using Denison as seen here. However, the theme mishandles the four images on the front page and the drop down menu has a tendency to malfunction.

I also tried Big Picture but abandoned it when I was unable to merge the beautiful Browse Collections page into the index page.

Ultimately I turned away from Omeka themes and instead used HTML5Up templates. I transitioned Dr. Daniela Busiciglio’s site that had been built with Denison from this:

to this (progettovitaliano.org):

Screen Shot 2017-12-04 of progettovitaliano.com

I’ve just finished transitioning another site that I built with Dr. Mirelsie Velazquez and her students from the Big Picture Omeka theme to this (http://docnarr.oucreate.com/):

Because Omeka’s API does not provide images, I was not able to load the images into these HTML based pages programmatically. Thus, these front pages will require a bit more maintenance in their HTML code than Omeka themes, which are controlled from the administrative GUI. However, I really like the flexibility that I get from stripping out the front page of the Omeka themes and replacing it with something that I can tweak to feature the best parts of the projects.

 

The Gods of Norman

In my history of science survey, we talk about how ancient cosmologies reflect the geography of the people. Thus the Nile River civilizations believe in mostly benevolent gods that bring annual, predictable, life-giving floods. The Norse have more vengeful and terrifying gods reflecting their harsher geography. Cosmologies are thus often connected to the lived reality and understandings of nature for a group.

Along these lines, I have the students write a cosmology reflecting their own hometown. They describe how the gods must be based on our observations. One of my students this semester, who goes by the handle xmac342, wrote the following entry. I think it should be used in our town’s marketing material.

The Gods of Norman

Norman, Oklahoma is a very lively town in which many people of various backgrounds come together. This must be in part because of the large school that is placed in the middle of it. No doubt this school was a gift from the Gods, as if it weren’t here, this town would not nearly be as advanced.

There are the Gods of education, love, lust, earth, sky, families, Agriculture, Music/fine arts, Roadways/construction, sleep, revenge/conflict, luck, and sport.

The God of education has not been happy with majority of the people, as they tend to evade the gift of schooling, so he punishes the people by giving them public schooling that is subpar, and making upper-division schooling so expensive that not many people can attend even if they wanted to. They also make sure that the people who continue their lives in the education field aren’t paid as well as in other places where people value the God’s gifts.

The God of love of course gives people the ability to love one another, but this is often mistaken for the gift from the God of lust. The God of earth gives us the green grass and trees that are seen throughout Norman, however, she has been at war with the God of roadways/construction, thus explaining why majority of the grass is dead and why the trees are all small. Also, the roadways and construction are a punishment from the Gods because of people’s continuous desire to have something better than they already do. Because of this, the Gods put endless construction, and when it does happen to be finished in one spot, after months or even years, it is just as annoying as before because of the quality, or confusion that it brings with it.

The God of the sky is also in charge of the wind and rain. He has so far protected Norman from major destruction from his Tornados, but also decides to allow a major drought majority of the time to punish us for our wastefulness of water. When he does allow it to rain though, he creates a flood everywhere, not only to rehydrate the earth, but to make the humans suffer as much as possible.

The god of families creates the different families of course, and works with the gods of love and lust to influence people’s lives. The God of Agriculture allows us to continue our farms and ranches and for the most part allow them to evade disasters caused by other Gods.

The God of music/fine arts allows the people of Norman to express themselves and create a culture. This plays into human identity, however the God also works with the God of lust to establish with artists are the most likely to be lusted over (Guitar players are definitely at the top of the list, while oboe players (really all woodwinds) and other unknown artists are very far near the bottom).

The God of sleep despises college students, band members, and all new parents for reasons unknown. The God of revenge and conflict loves to target high-school, and sometimes middle-school girls, as well as drivers. They tend to find the teenage angst and need to be better than everyone, and the need to get places quickly while believing that everyone around you is incredibly stupid very easy to manipulate. The God of luck is completely unpredictable, and greatly impacts the God of Sport, whose main gift was the University of Oklahoma football team. However, sometimes his gift seems like a punishment until the 3rd quarter, in which he decides that the people have been punished enough.

The Gods of Norman are very unforgiving. However, a little bit of each of them lives in every human, as they all came together to create the humans in the image of themselves. No God is more powerful than the others, and they all work with, or conflict with each other, equally.

Page 1 of 26

Powered by WordPress & Theme by Anders Norén

css.php