[personal profile] ocschwar
Things I want to play with: PANDAS. Time series data. RESTful interfaces. Machine learning algorithms.

To that end, I downloaded the WADL spec for the data coming from ISO-NE.com. It's complete. There is no point meddling with the rest of the WADL here, as the whole service is read-only for yours truly. The important part of the WADL spec is the <grammars> section, which is an XML Schema.

That schema is complete, so I need something to parse it. I just tried generateDS.py, which does almost everything I need. Its first failing is that, although it generates Python objects from the XML data I receive and the objects all have the right attribute names and values, it does not retain information about attribute types. I specifically need to know which attributes are typed as unique IDs (location IDs, mostly) so that I can make them columns in the PANDAS files I'm generating. The other failing is that it can only parse XML input. The JSON that ISO-NE also provides, I cannot parse yet.
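One way to get the type information back would be to read the declarations straight out of the schema. This is only a rough sketch with the standard library: the schema fragment, the `LocationIdType` name, and the field names are all made up for illustration, and the real ISO-NE <grammars> section will look different.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema fragment, NOT the real ISO-NE <grammars> section.
SAMPLE_XSD = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="LocationIdType">
    <xs:restriction base="xs:int"/>
  </xs:simpleType>
  <xs:complexType name="HourlyLmp">
    <xs:sequence>
      <xs:element name="BeginDate" type="xs:dateTime"/>
      <xs:element name="LmpTotal" type="xs:decimal"/>
    </xs:sequence>
    <xs:attribute name="LocId" type="LocationIdType"/>
  </xs:complexType>
</xs:schema>
"""


def declared_types(xsd_text):
    """Map each named element/attribute declaration to its declared type."""
    root = ET.fromstring(xsd_text)
    types = {}
    for node in root.iter():
        if node.tag in (XS + "element", XS + "attribute"):
            name, type_name = node.get("name"), node.get("type")
            if name and type_name:
                # Same-named fields in different complex types overwrite
                # each other here; a real version would key on the parent too.
                types[name] = type_name
    return types


def id_fields(types, id_types=("LocationIdType",)):
    """Names of the fields whose declared type is one of the ID types."""
    return sorted(name for name, t in types.items() if t in id_types)


types = declared_types(SAMPLE_XSD)
print(types)
print(id_fields(types))
```

The lookup could then sit alongside the objects generateDS.py produces, so the ID-typed fields can be picked out when building the columns.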

Date: 2014-09-13 08:55 pm (UTC)
From: [identity profile] arcticturtle.livejournal.com
Really want sandman to work as an out-of-the-box RESTful server, but it's kind of in flux right now. You may want to look at it anyway. And of course there's HTSQL, which is a freaking amazing query language and - arguably - a nice REST server too.

Pandas rocks all socks.

Check out ddlgenerator in PyPI - it's kind of a mess but might do what you need very easily. And then once you've shoved that data into an RDBMS with ddlgenerator, you can use ipython_sql to query it out into Pandas DataFrames.

Date: 2014-09-13 09:15 pm (UTC)
From: [identity profile] ocschwar.livejournal.com
Oh, wow, ddl-generator looks pretty damn well suited for this task.

The data in question can be represented pretty intuitively as multi-sheet spreadsheets.

Each sheet belongs to a UID'd entity (e.g. a power generator or a transmission junction), each column on the sheet is a numerical observation belonging to it, and each row is a point in time.

If ddl-generator figures it out from the JSON output, that would pretty much rock.
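For what it's worth, the sheet layout described above can be sketched in a few lines of plain Python, independent of what ddl-generator does. The JSON records and the `LocId`, `BeginDate`, `LmpTotal`, and `Congestion` field names are invented for the example; the actual ISO-NE payload is presumably nested differently.

```python
import json
from collections import defaultdict

# Hypothetical flat records; the real ISO-NE JSON will be shaped differently.
SAMPLE_JSON = """
[
  {"LocId": 4001, "BeginDate": "2014-09-13T00:00:00", "LmpTotal": 31.5, "Congestion": 0.2},
  {"LocId": 4001, "BeginDate": "2014-09-13T01:00:00", "LmpTotal": 29.8, "Congestion": 0.0},
  {"LocId": 4002, "BeginDate": "2014-09-13T00:00:00", "LmpTotal": 33.1, "Congestion": 1.4}
]
"""


def to_sheets(records, id_key="LocId", time_key="BeginDate"):
    """Group records into one 'sheet' per entity ID.

    Each sheet maps a timestamp (the row) to a dict of observations
    (the columns) for that entity.
    """
    sheets = defaultdict(dict)
    for rec in records:
        row = {k: v for k, v in rec.items() if k not in (id_key, time_key)}
        sheets[rec[id_key]][rec[time_key]] = row
    return dict(sheets)


sheets = to_sheets(json.loads(SAMPLE_JSON))
for loc_id, sheet in sheets.items():
    print(loc_id, sheet)
```

Each sheet is a dict of rows keyed by timestamp, so something like `pandas.DataFrame.from_dict(sheet, orient="index")` should turn it into a frame with one row per point in time.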

