Latest Version

Instead of posting a bunch of code to the blog to show the updates and changes, I thought it would be easier to post a couple of links where you can grab the latest code. They will take you to a folder where you can view the related files.

To run the scripts, simply visit http://www.cs.rpi.edu/~helmc2/TWC/TWC.html. Currently (because I can't write to the drive) the script opens a pop-up containing each visualization. This can run you out of RAM or freeze your computer, so make sure you are blocking pop-ups, then click the little link at the top to browse through the pop-ups and show one at a time. I wanted to test that everything was working and publish what's happening so others can check it out, but I will definitely change things so that it doesn't open 100+ windows. Ideally I will be able to provide a table of clickable links to the visualizations, and I hope to have something like that soon.
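When I get to the table-of-links version, a minimal sketch might look like the following. This is only an illustration: `buildLinkRows` is a name of my own, and the CGI URL pattern matches the `openCGI()` helper shown in the next post.

```javascript
// Rough sketch of the planned fix: instead of opening one pop-up per result,
// build an HTML table of clickable links to the CGI-generated visualizations.
// buildLinkRows is a hypothetical name; the URL pattern comes from openCGI().
function buildLinkRows(graphUris) {
  var base = "http://www.cs.rpi.edu/~helmc2/cgi-bin/TWC.cgi?param1=";
  var rows = [];
  for (var i = 0; i < graphUris.length; i++) {
    var href = base + encodeURIComponent(graphUris[i]);
    rows.push('<tr><td><a href="' + href + '">' + graphUris[i] + '</a></td></tr>');
  }
  return '<table>' + rows.join('') + '</table>';
}
```

The resulting HTML string could simply be written into the visualization div instead of calling window.open() in a loop.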

Posted in Uncategorized | Leave a comment

ALL THE QUERIES!

So after playing around with many designs, here is where I currently stand:

Break the work up between two files: one handles the initial work (getting the first set of links), and the other is called from the first with a parameter, the URL I need for the second query. The second file, a Perl CGI script, takes the URL and formats it into a query URL that is then used to generate the second visualization. The only remaining work in the second file is to further parse the results of the query and reorganize them for readability.


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8"/>
<title>Google Visualization</title>

<script type="text/javascript" src="http://www.google.com/jsapi"></script>
<script type="text/javascript">

google.load('visualization', '1', {packages: ['table']});

// Load data using the stored SPARQL query, requesting Google DataTable output.
function drawTable() {
  var queryLoc = 'http://www.cs.rpi.edu/~helmc2/query';
  var query = new google.visualization.Query('http://logd.tw.rpi.edu/sparql.php?' +
      'query-uri=' + queryLoc +
      '&output=gvds');
  // Send the query with a callback function.
  query.send(handleQueryResponse);
}

// Open the CGI script in a pop-up, passing one result URI as param1.
function openCGI(p) {
  var url = "http://www.cs.rpi.edu/~helmc2/cgi-bin/TWC.cgi?param1=" + p;
  window.open(url, p, 'width=400,height=200');
}

function handleQueryResponse(response) {
  if (response.isError()) {
    alert('Error in query: ' +
        response.getMessage() + ' ' +
        response.getDetailedMessage());
    return;
  }
  // Get the data and open one pop-up per result row.
  var data = response.getDataTable();
  var rows = data.getNumberOfRows();
  for (var i = 0; i < rows; i++) {
    openCGI(encodeURIComponent(data.getValue(i, 0)));
  }

  var viz = document.getElementById('my_visualization_DIV');
  new google.visualization.Table(viz).draw(data);
}

google.setOnLoadCallback(drawTable);
</script>
</head>
<body>
<div id='my_visualization_DIV'>Loading Results...</div>
</body>
</html>

As you can see, there isn't much change in this file from a typical vis, except that before actually creating the table there is a for loop that calls openCGI() on the value in each row. This calls the second script, passing the result of the first query as a parameter so I can use it to generate the second SPARQL query. I chose to do this because I couldn't directly write the results anywhere using JS, and I thought this was more practical than trying to create every visualization at once.
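To make the hand-off concrete, here is what one pop-up URL looks like after encodeURIComponent has run. The graph URI is one of the example results discussed later in this blog, and popupUrl is just an illustrative name:

```javascript
// Illustration only: the URL openCGI() ends up opening for one result row.
// The graph URI is the example dataset used elsewhere in this blog.
var graphUri = "http://logd.tw.rpi.edu/source/data-gov/dataset/2793/version/2010-Dec-17/conversion/enhancement/1/subset/sample";
var popupUrl = "http://www.cs.rpi.edu/~helmc2/cgi-bin/TWC.cgi?param1=" +
    encodeURIComponent(graphUri);
// encodeURIComponent turns "://" into "%3A%2F%2F" so the graph URI survives
// as a single query-string parameter.
```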

#!/usr/bin/env perl
use strict;
use warnings;

use CGI qw/:standard/;
use CGI::Carp qw/fatalsToBrowser warningsToBrowser/;
use Encode;

print header;

warningsToBrowser(1);

print start_html;
my $link = param('param1');

my $query = "http://logd.tw.rpi.edu/sparql.php?query-option=text&query=prefix+rdfs%3A+++++++%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0D%0Aprefix+void%3A+++++++%3Chttp%3A%2F%2Frdfs.org%2Fns%2Fvoid%23%3E%0D%0Aprefix+ov%3A+++++++++%3Chttp%3A%2F%2Fopen.vocab.org%2Fterms%2F%3E%0D%0Aprefix+conversion%3A+%3Chttp%3A%2F%2Fpurl.org%2Ftwc%2Fvocab%2Fconversion%2F%3E%0D%0A%0D%0ASELECT+DISTINCT+%3Feg+%3Fcol+%3Fp+%3FpLabel+%3Fo%0D%0AWHERE+{%0D%0Agraph+%3C" . $link . "%3E+{%0D%0A[]+void%3AexampleResource+%3Feg+.%0D%0A%3Feg+%3Fp+%3Fo+.%0D%0Aoptional{+%3Fp+ov%3AcsvCol+++++++++++++++++++++++%3Fcol+}%0D%0Aoptional{+%3Fp+rdfs%3Alabel++++++++++++++++++++++%3FpLabel+}%0D%0Aoptional{+%3Fp+conversion%3AsubjectDiscriminator+%3Fdiscrim+}%0D%0A}%0D%0A}+ORDER+BY+%3Feg+%3Fcol&service-uri=&output=gvds&callback=&tqx=&tp=";

print qq~
<script type="text/javascript" src="http://www.google.com/jsapi"></script>
<script type="text/javascript">

google.load('visualization', '1', {packages: ['table']});

function drawTable() {
  // The query URL is interpolated from the Perl variable built above.
  var query = new google.visualization.Query('$query');
  // Send the query with a callback function.
  query.send(handleQueryResponse);
}

function handleQueryResponse(response) {
  if (response.isError()) {
    alert('Error in query: ' +
        response.getMessage() + ' ' +
        response.getDetailedMessage());
    return;
  }
  var data = response.getDataTable();
  var viz = document.getElementById('my_visualization_DIV');
  new google.visualization.Table(viz).draw(data);
}
google.setOnLoadCallback(drawTable);
</script>
~;
print div({-id=>'my_visualization_DIV'},'Loading Content');
print end_html;
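The giant URL-encoded string in $query is just a readable SPARQL query (the same one shown decoded in a later post) run through percent-encoding. Here is a hedged sketch of building it programmatically instead; note that encodeURIComponent produces %20 for spaces where the hand-encoded version uses +, which I am assuming the endpoint accepts either way.

```javascript
// Sketch: build the endpoint URL from readable SPARQL instead of a
// hand-encoded string. The graph URI is the example dataset from this blog.
var graph = "http://logd.tw.rpi.edu/source/data-gov/dataset/2793/version/2010-Dec-17/conversion/enhancement/1/subset/sample";
var sparql =
  "prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n" +
  "prefix void: <http://rdfs.org/ns/void#>\n" +
  "prefix ov: <http://open.vocab.org/terms/>\n" +
  "prefix conversion: <http://purl.org/twc/vocab/conversion/>\n" +
  "SELECT DISTINCT ?eg ?col ?p ?pLabel ?o\n" +
  "WHERE {\n" +
  "  graph <" + graph + "> {\n" +
  "    [] void:exampleResource ?eg .\n" +
  "    ?eg ?p ?o .\n" +
  "    optional{ ?p ov:csvCol ?col }\n" +
  "    optional{ ?p rdfs:label ?pLabel }\n" +
  "    optional{ ?p conversion:subjectDiscriminator ?discrim }\n" +
  "  }\n" +
  "} ORDER BY ?eg ?col";
var queryUrl = "http://logd.tw.rpi.edu/sparql.php?query-option=text&query=" +
    encodeURIComponent(sparql) + "&output=gvds";
```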


Upcoming Plans

Well, the last couple of weeks have been very busy and it has been troublesome to get work done here, but I have been making some slow progress. Currently my plan of attack is:

1. Get a script running that takes the links generated from the first query and plugs them into the second query. This is pretty simple, and I will only be using a subset of the first query's results as a proof of concept.

2. Once the script is generating all of the next queries properly, I plan on parsing the data tables generated by the queries and putting them into new tables to be visualized using the Google viz tool. By re-ordering the data I can make a more logical graph from the original query without having to run another, much more time-consuming query.

3. Once I have the data ordered and have created the visualization, the tool should be ready. I still need to decide where I will store the created visualizations, but that is a relatively minor detail that I will worry about more when I get there.
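Step 2's re-ordering could be sketched like this. The row shape [eg, col, p, pLabel, o] mirrors the second query's SELECT list, and groupByExample is a name of my own:

```javascript
// Sketch of step 2: group the flat result rows by example resource (?eg)
// so each subject's properties sit together in a more readable table.
// Row shape assumed: [eg, col, p, pLabel, o], matching the SELECT list.
function groupByExample(rows) {
  var grouped = {};
  for (var i = 0; i < rows.length; i++) {
    var eg = rows[i][0];
    if (!grouped[eg]) { grouped[eg] = []; }
    grouped[eg].push({ col: rows[i][1], p: rows[i][2], pLabel: rows[i][3], o: rows[i][4] });
  }
  return grouped;
}
```

Each grouped entry could then be rendered as its own small table without another round-trip to the endpoint.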

Hopefully this week will be a little less busy and I can get these ideas completed and have a working visualization tool!


Working on getting some data…

So after getting responses from both John and Tim, I have been able to look into the different data in the data-set and, using some resources created by Tim, generate some queries that are returning data!

The query I am using to list the data-sets uses void:dataDump to get the raw RDF dumps from the files. Then, using some of the queries from the resources Tim has up, I was able to query the dataset and get a detailed SPO-type graph from it. You can find the tutorials here!

Here is the initial query:

PREFIX void: <http://rdfs.org/ns/void#>
SELECT ?dataset
WHERE {
  GRAPH ?dataset {
    ?dataset void:dataDump ?datadump .
  }
}
GROUP BY ?dataset

After that we get the results:

We can then take one of the results:
<http://logd.tw.rpi.edu/source/data-gov/dataset/2793/version/2010-Dec-17/conversion/enhancement/1/subset/sample>

And plug it into the next query:
prefix rdfs:       <http://www.w3.org/2000/01/rdf-schema#>
prefix void:       <http://rdfs.org/ns/void#>
prefix ov:         <http://open.vocab.org/terms/>
prefix conversion: <http://purl.org/twc/vocab/conversion/>

SELECT DISTINCT ?eg ?col ?p ?pLabel ?o
WHERE {
  graph <http://logd.tw.rpi.edu/source/data-gov/dataset/2793/version/2010-Dec-17/conversion/enhancement/1/subset/sample> {
    [] void:exampleResource ?eg .
    ?eg ?p ?o .
    optional{ ?p ov:csvCol                       ?col }
    optional{ ?p rdfs:label                      ?pLabel }
    optional{ ?p conversion:subjectDiscriminator ?discrim }
  }
} ORDER BY ?eg ?col

And the following results can be found:


Waiting for some answers…

While I am still waiting to hear back from John, I thought I would put up a post on where my current thoughts are and what they will hopefully lead to. Starting with a basic query like:

PREFIX conversion: <http://purl.org/twc/vocab/conversion/>
PREFIX void:       <http://rdfs.org/ns/void#>
SELECT ?dataset
WHERE {
  GRAPH ?dataset {
    ?dataset void:subset ?subdataset .
  }
}
GROUP BY ?dataset

And a resulting output can be found here.

The idea of the query is to return a list of all of the datasets in the triplestore so I can then take the individual links and query them. So far I am still not sure how to form or reform the links to get access to the data, or whether there are changes I will need to make to the query. Once the query returns what I am looking for, my hope is to start with a basic SPO query to return the triples in each data set and find out what is in there. Once I can get the individual data from the datasets, it will be pretty easy to graph it since I have already written a script to do so. That's about it for now; hopefully I will get a response from John soon so I can get this next part going!
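The basic SPO query mentioned above might look something like this sketch, where <DATASET-URI> stands in for one of the links the listing query returns (the LIMIT is my own addition to keep the result small while exploring):

```sparql
# Sketch: list a sample of triples from one dataset graph.
# <DATASET-URI> is a placeholder for a link returned by the listing query.
SELECT ?s ?p ?o
WHERE {
  GRAPH <DATASET-URI> {
    ?s ?p ?o .
  }
}
LIMIT 50
```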


A Summer Plan

So it is time to lay out some goals and plans for completing some research objectives during this summer.

My primary objectives are:
~ To learn about and understand the LOGD data sets and answer the following questions: What's the ontology? How are the graphs set up? How is the triple store set up? What's a dataset? A subdataset? A subset? etc.
~ To explore and experiment with the power of the google visualization tool.
~ Finally, to combine the knowledge from the previous two goals and create a tool that can query the LOGD triple store and produce visualizations of the data sets within.

Work towards these goals begins by e-mailing John Erickson to learn the structure of the data-gov triples and understand how to access them. Once I have completed the first objective, I should be able to create queries into the triple store and use them to query the specific data sets within it. It is my hope that this results in a useful tool for discovering existing data sets and for explaining the structure of the data sets and the triple store. Finally, I hope this project can greatly aid new researchers in understanding and using the triple stores for their own research projects.


Semester Reflections and [Missed] Opportunities

Wow, I can't believe I am writing my first entry in this blog at the end of the semester. After meeting with Patrick at the beginning of the year, I discussed and reflected a little further on the previous semester and told him that, while I was taking 5 courses, I still wished to participate in the TWC to gain more experience, especially since the previous semester had given me a lot of new and interesting experience. We discussed a new plan for the semester in which I would hopefully work on my own project: building off of the visualizations I worked on last semester and gaining enough insight to build a dynamic framework that automatically creates some basic visualizations of the data found in the TWC triplestores.

I felt that creating visual representations of the data would offer a few benefits for myself and the TWC. The first would be the ability to easily see what kinds of data are stored on the triplestore. Since there is often confusion about what data is located in a certain triplestore, the framework could resolve those issues and make it easier to choose an appropriate data set. Secondly, the framework could be integrated into the new tv/project advertising system in Winslow to help researchers easily make quick charts of their data. Lastly, I thought working on this framework would let me experiment further with visualizations, which I really enjoyed last semester and thought would be a great area to continue researching and learning.

Towards the beginning of the semester I brought up some of my previous visualizations, reworked the scripts to be as simple as possible, and generalized their format to be as similar as possible, since I thought this would make the best foundation for creating additional visualizations.

The following is the code of one of the visualizations of data that I used as a test basis:

<script type="text/javascript" src="http://www.google.com/jsapi"></script>
<script type="text/javascript">
google.load('visualization', '1', {packages: ['geomap']});
function drawMap() {
  // To see the data that this visualization uses, browse to
  // http://data-gov.tw.rpi.edu/raw/33/data-33.rdf
  var query = new google.visualization.Query('http://data-gov.tw.rpi.edu/ws/sparqlproxy.php?' +
      'query-uri=http%3A%2F%2Fdata-gov.tw.rpi.edu%2Fsparql%2Fquakemap.sparql' +
      '&output=gvds');
  // Send the query with a callback function.
  query.send(handleQueryResponse);
}
function handleQueryResponse(response) {
  // Check for query errors
  if (response.isError()) {
    alert('Error in query: ' +
        response.getMessage() + ' ' +
        response.getDetailedMessage());
    return;
  }
  var data = response.getDataTable();
  // Configure map options
  var options = {};
  options['region'] = 'US';
  options['dataMode'] = 'markers';
  options['width'] = 450;
  options['height'] = 300;

  var viz = document.getElementById('my_visualization_DIV');
  new google.visualization.GeoMap(viz).draw(data, options);
}
google.setOnLoadCallback(drawMap);
</script>

It is pretty easy to see that there are a number of issues I would have to deal with to make this kind of script general enough to plug and play with another query. Breaking the script down into as simple a visualization as possible, there are three important parts to consider for developing a general visualization. First, I would need to be able to form a query that can get some relevant data from the data set. I think this will likely be one of the more challenging points, because the framework won't know what anything is called. The following primitive query shows how getting specific data could be really difficult:

PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX tw:   <http://tw.rpi.edu/schema/>
PREFIX twi:  <http://tw.rpi.edu/instances/>
SELECT DISTINCT ?Subject ?Name ?Loc ?Affiliation
WHERE {
  ?Subject a foaf:Person.
  ?Subject foaf:name ?Name.
  ?Subject tw:hasAffiliation ?Affiliation.
  FILTER (?Affiliation = twi:TetherlessWorldConstellation)
  ?Subject tw:hasLocation ?Loc.
  FILTER (?Loc = twi:RPI_Winslow_1148A)
}

Depending on the power of SPARQL and any additional tools that I can harness, the actual query could be the biggest strength or weakness of the project. Second, I will need a SPARQL endpoint capable of returning the query results in a form that can be easily turned into a visualization. Fortunately, the current endpoint here works quite well and provides a good solution for returning the query results to a visualization script.

Finally, I will need a strong, general visualization script. As seen in the earlier script, it is vital to keep the form general and avoid things like geo-maps until I am able to refine my queries to handle that information. Realistically, the script side should be very simple for a general visualization and could later be scaled up to handle more dynamic visualizations, including things like geo-maps; for now I would like to get something simple, like a pie chart or bar graph showing the different foaf types in the RDF file. After some relatively simple modifications to the earthquake script, I can create a script that returns a table of the data the SPARQL query generates:


<script type="text/javascript" src="http://www.google.com/jsapi"></script>
<script type="text/javascript">
google.load('visualization', '1', {packages: ['table']});
function drawTable() {
  var query = new google.visualization.Query('http://data-gov.tw.rpi.edu/ws/sparqlproxy.php?' +
      'query-uri=http%3A%2F%2Fdata-gov.tw.rpi.edu%2Fsparql%2Fquakemap.sparql' +
      '&output=gvds');
  query.send(handleQueryResponse);
}
function handleQueryResponse(response) {
  if (response.isError()) {
    alert('Error in query: ' +
        response.getMessage() + ' ' +
        response.getDetailedMessage());
    return;
  }
  var data = response.getDataTable();
  var viz = document.getElementById('my_visualization_DIV');
  new google.visualization.Table(viz).draw(data);
}
google.setOnLoadCallback(drawTable);
</script>

Resulting in the following table:
Table Visualization

Moving on from the project ideas and plans that were the main focus of my work this semester, it is important to analyze why I wasn't able to spend enough time on research to accomplish my goals. At the beginning of the semester I felt I could handle the additional workload of research on top of my courses, because they seemed relatively easy and not project-intensive. I cautiously signed up for the minimum amount of research, because with 5 courses it was dangerous to sign up for more. During the semester my courses became more and more challenging, and eventually I fell behind not only on research but on course work as well. This could have been avoided with more careful planning and more research into the workload of the courses I would be taking. In retrospect, I would have benefited from this semester much more if I had taken fewer courses and focused on research.

In continuing my research, I would like to devote a few hours each week during the summer to further developing my ideas and project and to keep contributing to the TWC. Although I was too busy and overwhelmed during the semester, over the summer I can leave my other responsibilities at work and will have ample free time to contribute to my project.

While I did still manage to spend some time working on research and made definite progress learning new visualization techniques and progressing my ideas for my project, I would like the opportunity to continue collaborating and working with the TWC. After a long busy semester I am looking to get back on the horse and re-establish myself as a researcher.
