Summer Reflections

After a long summer of traveling, tiring work weeks with Cisco, bathing in the summer sun, and getting down to the basics with my project, the time has come to wrap the summer up and start thinking about the next year! Progress this summer was off and on: a lot of work got done in my weeks of free time, with only a slow trickle during the busiest weeks. I came into the summer with some basic plans and ideas of what I wanted to do (dynamic visualizations) and worked out the details for accomplishing them as I went. My plan of attack was as follows:

  • Write a query to return all of the information from a specific data set.
  • Write a general-purpose query to get the URLs of the specific data sets.
  • Create a mash-up of the general query.
  • Create a mash-up that can work for the results of the specific query returned by the general query.
  • Combine all of the mash-ups into an attractive (debatable) and easy to navigate web page that can display what is in a specific triple store.

Starting with the visualizations I had previously produced, I turned them into tables to create something a little more general. This would give me a base for displaying the data I gather from my first set of queries. I then worked on getting my data. After investigating the tutorials written by Tim L, I was able to create a couple of useful queries based on the information in his tutorials.

    SELECT ?dataset
    WHERE {
      GRAPH ?dataset {
        ?dataset void:dataDump ?datadump .
      }
    }
    GROUP BY ?dataset

This base query generates a list of all of the URLs that I would use in my second query:

    prefix rdfs:       <http://www.w3.org/2000/01/rdf-schema#>
    prefix void:       <http://rdfs.org/ns/void#>
    prefix ov:         <http://open.vocab.org/terms/>
    prefix conversion: <http://purl.org/twc/vocab/conversion/>

    SELECT DISTINCT ?eg ?col ?p ?pLabel ?o
    WHERE {
      graph <http://logd.tw.rpi.edu/source/data-gov/dataset/2793/version/2010-Dec-17/conversion/enhancement/1/subset/sample> {
        [] void:exampleResource ?eg .
        ?eg ?p ?o .
        optional{ ?p ov:csvCol                       ?col }
        optional{ ?p rdfs:label                      ?pLabel }
        optional{ ?p conversion:subjectDiscriminator ?discrim }
      }
    }
    ORDER BY ?eg ?col

Simply substituting different results from the first query into the second query would allow me to create a table containing all the data from the data set, as seen below:

Now that I had all of the raw results, it was time to figure out a way to link everything together into a “super-mash-up” containing everything, and to think of useful ways to display the data.
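The linking step, where each result of the general query is substituted into the specific query, can be sketched roughly as follows. This is only an illustration, not the code from my actual page: the function names buildSampleQuery and buildAllSampleQueries are made up for this example, and the sketch assumes the general query returns plain graph URIs as strings.

```javascript
// Hypothetical helper (not from the original script): builds the
// per-dataset query for one named graph URI.
function buildSampleQuery(graphUri) {
  // Reject characters that are not allowed inside a SPARQL IRI reference,
  // so a malformed result cannot break out of the <...> brackets.
  if (/[<>"{}|^`\\\s]/.test(graphUri)) {
    throw new Error("Unsafe graph URI: " + graphUri);
  }
  return [
    "prefix rdfs:       <http://www.w3.org/2000/01/rdf-schema#>",
    "prefix void:       <http://rdfs.org/ns/void#>",
    "prefix ov:         <http://open.vocab.org/terms/>",
    "prefix conversion: <http://purl.org/twc/vocab/conversion/>",
    "SELECT DISTINCT ?eg ?col ?p ?pLabel ?o",
    "WHERE {",
    "  graph <" + graphUri + "> {",
    "    [] void:exampleResource ?eg .",
    "    ?eg ?p ?o .",
    "    optional{ ?p ov:csvCol ?col }",
    "    optional{ ?p rdfs:label ?pLabel }",
    "    optional{ ?p conversion:subjectDiscriminator ?discrim }",
    "  }",
    "}",
    "ORDER BY ?eg ?col"
  ].join("\n");
}

// Hypothetical helper: one specific query per URI returned by the
// general query (assumed here to arrive as an array of strings).
function buildAllSampleQueries(datasetUris) {
  return datasetUris.map(buildSampleQuery);
}
```

Keeping the query text in one function means the table mash-up only ever needs a graph URI to work with, which is what makes the general query's output reusable.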

I entertained a few different ideas for how to create my final mash-up and got pretty far into a couple of them before scrapping each in favor of a better one. At first I thought that sending the data off to a Perl CGI script would streamline the process, but after running into trouble with writing data I dropped that approach in favor of a pure JS implementation. I found a way to generate the queries that I would send to Google Viz by looking at the URL produced when running queries on the SPARQL proxy hosted by the LOGD site. With the queries ready and the mash-ups good to go, it was time to link everything together. The original JS implementation simply printed out all of the visualizations and left a lot to be desired in terms of usability. Since an important part of this project is to make the data usable, I took a while to think of a reasonable way to display it. I still have much work to do, but I currently have a pretty good demo of the potential that the script can offer.
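The URL trick can be sketched as below. Treat everything specific here as an assumption: the endpoint address, the "query" and "output" parameter names, and the output value are placeholders for whatever the proxy's own URLs actually show, and buildProxyUrl is a name invented for this example.

```javascript
// ASSUMPTION: placeholder endpoint, not the real proxy address.
const PROXY_ENDPOINT = "http://example.org/sparql-proxy";

// Hypothetical helper: turns a SPARQL query string into a URL of the
// kind the proxy produces. The parameter names "query" and "output"
// are assumed; copy the real ones from a URL the proxy generates.
function buildProxyUrl(query, endpoint, output) {
  const base = endpoint || PROXY_ENDPOINT;
  const format = output || "json"; // placeholder output format
  // encodeURIComponent escapes ?, {, }, spaces and the other characters
  // that SPARQL uses but that are not safe inside a URL query string.
  return base +
    "?query=" + encodeURIComponent(query) +
    "&output=" + encodeURIComponent(format);
}
```

In the browser, a URL built this way could then be handed to the Google Visualization API as a data source, for example with new google.visualization.Query(url), provided the proxy is asked for a format that the API understands.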

 

You can visit the real thing at: http://www.cs.rpi.edu/%7Ehelmc2/TWC/TWCMainDev.html

I have enjoyed working on this project and would greatly enjoy continuing to do so, and perhaps helping out with some other projects going on in the TWC. I think the work I completed this summer is a great step towards my goal of creating a tool that can give researchers, especially undergrads, an idea of what data is out there that they can get their hands on and start doing more with! Overall I think this summer was highly successful, and I was able to learn a lot about creating visualizations and using SPARQL. I’m excited to keep making progress and see what everyone else is up to!


About Cameron

Cameron Helm is a third-year undergraduate student at Rensselaer Polytechnic Institute, currently pursuing his bachelor's degree in Computer Science, and a member of the TWC.