Wednesday, September 27, 2006

Found www.mappedup.com 

I've found a nice little service called www.mappedup.com that shows blog entries as bubbles on a world map. It's currently in beta, like all the Web 2.0 stuff.

Anyway, they really have some things to fix:

Tuesday, September 19, 2006

This week as a user, this week as a developer 

Example 1
A bug tracking system. SQL Server based.

Use case: I want to analyze the historical data of bug entries with a BI visualization tool.

Situation: The SQL Server schema is not documented, as far as I know. The IT people claim that I'm not allowed to use the 'raw' DB interface. They say I should use the CSV export functionality provided in the client.

Solution: Only workarounds. I have to do the export manually from the client UI; no automation is possible. The export can't give me the historical data, only the current state of the bug entries. The BI tool can't deal with special delimiters when importing CSV, so I have to take the extra step of converting the data into Excel. So 19xx!
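
The delimiter step, at least, could be scripted instead of detouring through Excel. A minimal sketch, assuming the export uses semicolons and the BI tool expects commas; the file names, columns and rows are made up for illustration:

```powershell
# Hypothetical sample export; the real delimiter and column names are assumptions.
$source = Join-Path ([System.IO.Path]::GetTempPath()) "bugs-export.csv"
$target = Join-Path ([System.IO.Path]::GetTempPath()) "bugs-clean.csv"
Set-Content -Path $source -Value @(
    'Id;Title;State'
    '17;"Crash, on save";Open'
    '18;Typo in dialog;Closed'
)

# Read with the special delimiter, write back with the default comma.
Import-Csv -Path $source -Delimiter ';' |
    Export-Csv -Path $target -NoTypeInformation
```

This does nothing about the missing history or the manual export, of course; it only removes one manual step.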

Example 2
A version control system. File system based.

Use case: I want to be informed about changes on a set of documents in the system, ideally by a data feed (RSS).

Situation: The system has a lot of command-line tools to make 'queries' against it. Each has a different syntax for telling the system what I would like to know, and each produces different textual results. The system also has a couple of UI tools. Each has a different workflow for telling the system what I would like to know, and each presents its results differently; in the end, none of those results can be exported. As far as I know, it has absolutely no consistent query capabilities or tools.

Solution: I've found a tool called RSSBus: very smart, and a very nice idea to use RSS for integration and as a messaging system. It can be used to fetch the file metadata each time my feed aggregator requests the feed. Nice workaround, but from a performance perspective it would be a nightmare if all users did it the same way. I assume the version control system would go down under the heavy load. So 19xx!

Example 3
A License Server system. Not even considered to be a data system.

Use case: I want to analyze the license usage of one of our systems. When is the concurrent license maximum reached, what's the high watermark, what's the average, and so on. The results should be used to decide how many licenses we really need to pay for.

Situation: The system has only a UI client which presents an ugly textbox with some partially helpful information.

Solution: Not even an initial one. I don't have a real chance of getting the needed information, although it could save us several thousand dollars. So 19xx!
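
If the license server exposed its usage as data, the analysis itself would be tiny. A sketch under that assumption; the samples, the polling times and the number of owned licenses are all made up:

```powershell
# Hypothetical samples: licenses in use at each polling time (made-up numbers).
$samples = @(
    [pscustomobject]@{ Time = [datetime]"2006-09-18T09:00"; InUse = 12 }
    [pscustomobject]@{ Time = [datetime]"2006-09-18T11:00"; InUse = 18 }
    [pscustomobject]@{ Time = [datetime]"2006-09-18T14:00"; InUse = 20 }
    [pscustomobject]@{ Time = [datetime]"2006-09-18T17:00"; InUse = 15 }
)
$licensesOwned = 20   # assumption for illustration

# High watermark and average over all samples.
$stats = $samples | Measure-Object -Property InUse -Maximum -Average

# Samples where the concurrent maximum was reached.
$atLimit = @($samples | Where-Object { $_.InUse -ge $licensesOwned })

"High watermark : $($stats.Maximum)"
"Average        : $($stats.Average)"
"Times at limit : $($atLimit.Count)"
```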

Example 4
A debugging/profiling tool. File based in proprietary format.

Use case: Just wanted to see some advanced views on the data gathered during profiling.

Situation: The tool has really nice capabilities in terms of profiling a running application. When it comes to data analysis, it has a fixed set of views with some pragmatic filter/sort features. But if you want some simple calculated views or aggregations, you're lost. The vendor says some of these features will come with the next version of the tool. I mean, all I want is to be able to access my very own data with the tools I prefer.

Solution: They have a restricted text export feature. Again, that means doing manual exports, redundant data, text parsing and all of that. So 19xx!

Example 5
A project/product I worked on in the life science area. Oracle DB based.

Use case: A customer has a LIMS system that the results from our measurement system should go into.

Situation: Although I think we technically have everything in place to serve the customer's need, the database was never promoted to customers as an open system for dealing with data. Customers use a client-side Excel export to bring the data into their LIMS with macros. That thing breaks every time the report template changes, because the Excel structures change with it. The customer never asked about a more open/stable/automated solution to deal with the data.

Solution: Hm, none currently. This time I'm the developer, seeing my customers in the very same situation I was in in examples 1-4. Who's to blame? The developers, who don't put pressure on marketing to make the products really open systems? The marketing people, who don't realize that a product should be more than a fancy UI these days? The customer, who has known the pain of data integration for years and gave up asking for better interfaces? When do we start to be a bit more 20xx?


Sunday, September 17, 2006

Fun with Powershell and Shoutcast.com 

What do you do when it's late summer, you've had a week full of 'meta stuff' without any hacking, and your girlfriend is at her parents' place? *ggg* I took my new discovery, PowerShell, and wanted to do something useful. OK, useful in a way not everyone would agree with...

OK, I have a Buffalo LinkTheater; the media server is normally a Buffalo LinkStation. I listen to Internet radio more and more. So far I have copied the pls files to the LinkStation manually, and I can listen to the audio streams by selecting the files on the LinkTheater. Wouldn't it be nice to have all the Shoutcast stations on my LinkTheater?

Hm. Let's see. My first idea was to do some HTML screen scraping. Bad idea: the HTML returned is not XHTML that can be cast to XML, so the following went wrong:

    $Url = "http://www.shoutcast.com"
    $webClient = (new-object System.Net.WebClient)
    $xml = [xml]$webClient.DownloadString($Url)
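
The cast fails because [xml] expects well-formed XML, and typical HTML is not. A small offline sketch of the failure mode, using a made-up snippet rather than the real Shoutcast page:

```powershell
# Made-up snippet: the unclosed <br> is fine for browsers but not for an XML parser.
$html = '<html><body>Top stations<br></body></html>'
$parsed = $null
try {
    $parsed = [xml]$html
}
catch {
    "Not well-formed XML: $($_.Exception.Message)"
}

# The same content written as XHTML (self-closed <br />) casts cleanly.
$xhtml = [xml]'<html><body>Top stations<br /></body></html>'
$xhtml.DocumentElement.Name
```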

Next try. Shoutcast might have some API to get the stations as XML. An initial Google search gave:

    $Url = "http://www.shoutcast.com/sbin/xmllister.phtml"
    $webClient = (new-object System.Net.WebClient)
    $xml = [xml]$webClient.DownloadString($Url)
    $xml.WinampXML.playlist.entry

    Playstring : http://localhost:8000
    Name : UPGRADE NEEDED! Please ask your software authors to upgrade their SHOUTcast directory support!
    Genre : Please Upgrade
    Nowplaying : Please Upgrade
    Listeners : 1
    Bitrate : 128

Hm. Something's going on here. Reading this thread on the Winamp forum, I learned how to get at least the top 30 stations:

    $Url = "http://www.shoutcast.com/sbin/xmllister.phtml?service=winamp2&no_compress=1"
    $webClient = (new-object System.Net.WebClient)
    $xml = [xml]$webClient.DownloadString($Url)
    $xml.WinampXML.playlist

    num_entries label entry
    ----------- ----- -----
    30 SHOUTcast top 30 {radioparty.pl - najlepsza klubowa m...

Fine. That could be a starting point, but it is not what I wanted. I wanted all 15000 stations, not 30. Digging a bit deeper, I realized that the Shoutcast list format has changed in the meantime. There is a service to retrieve all the genres; with that you can retrieve all the stations per genre and finally build the URL to retrieve the pls file.

    $genres = [xml](new-object System.Net.WebClient).DownloadString("http://www.shoutcast.com/sbin/newxml.phtml")
    $genres.genrelist

    genre
    -----
    {24h, 60s, 70s, 80s...}

    $Url = "http://www.shoutcast.com/sbin/newxml.phtml?genre=70s"
    $stations = [xml](new-object System.Net.WebClient).DownloadString($Url)
    $stations.stationlist

    tunein station
    ------ -------
    tunein {!KICKRADIO the 80s Channel. Only 80's the whole day, Bi...

    $Url = "http://www.shoutcast.com/sbin/tunein-station.pls?id=158194"
    $pls = (new-object System.Net.WebClient).DownloadString($Url)
    $pls

    <br />
    <b>Notice</b>: Undefined index: HTTP_USER_AGENT in <b>/usr/local/apache/htdocs/sbin/tunein-station.pls</b> on line <b>22</b><br />
    <br />
    <b>Notice</b>: Undefined index: HTTP_USER_AGENT in <b>/usr/local/apache/htdocs/sbin/tunein-station.pls</b> on line <b>34</b><br />
    <br />
    <b>Warning</b>: Cannot modify header information - headers already sent by (output started at /usr/local/apache/htdocs/sbin/tunein-station.pls:22) in <b>/usr/local/apache/htdocs/sbin/tunein-station.pls</b> on line <b>44</b><br />
    [playlist]
    numberofentries=1
    File1=http://81.4.95.91:8000
    Title1=(#1 - 46/250) !KICKRADIO the 80s Channel. Only 80's the whole day
    Length1=-1
    Browser1=http://www.winamp.com/bin/sc/sccontext.php?host=81.4.95.91:8000&title=!KICKRADIO the 80s Channel. Only 80's the whole day&slots=46&genre=80s 70s Pop&url=http%3A%2F%2Fwww.kickradio.nl
    Version=2
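
The PHP notices in front of the playlist complain about a missing HTTP_USER_AGENT, which suggests that the request carried no User-Agent header. Sending one might make the notices go away; that is a guess from the notice text, not something verified against the server. A sketch, with a made-up agent string:

```powershell
$webClient = New-Object System.Net.WebClient

# Hypothetical agent string; any descriptive value should do.
$webClient.Headers.Add("User-Agent", "PlaylistFetcher/0.1 (PowerShell)")

# The download call itself stays the same as before:
# $pls = $webClient.DownloadString("http://www.shoutcast.com/sbin/tunein-station.pls?id=158194")
```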

The playlist part of the response looks as expected. After tweaking these lines a bit to create the desired directory structure on the LinkStation, and to remove illegal characters from the station names used to generate the pls file names, I found that I got a 'Too many requests. Try again tomorrow.' response from the Shoutcast server. Hm. They really blocked my IP after a few hundred requests. OK, so I extended the script to save the audio stream details only when it gets a useful response and to skip files that already exist. Running the script multiple times should therefore result in a complete list over time. The script currently looks like this:

    # Escape characters that are not allowed in file names ('*' becomes %2A).
    function ConvertTo-PathString
    {
        param ([string] $value = $null)

        $value.Replace("*", "%2A")
    }

    # Reverse of ConvertTo-PathString.
    function ConvertFrom-PathString
    {
        param ([string] $value = $null)

        $value.Replace("%2A", "*")
    }

    $DomainUrl = "http://www.shoutcast.com"
    $BaseUrl = "/sbin/newxml.phtml"
    $targetRoot = "X:\playlists\shoutcast.com"
    $Tmr = "Too many requests. Try again tomorrow."

    $WriteErrors = 0
    $RetrieveErrors = 0
    $Successes = 0
    $Existing = 0

    $webClient = New-Object System.Net.WebClient

    $xml = [xml]$webClient.DownloadString($DomainUrl + $BaseUrl)

    $counter = 0
    $xml.genrelist.genre | foreach {
        $genre = $_
        $counter++
        Write-Progress $genre.name "starting" -Id 0 -PercentComplete ($counter / $xml.genrelist.genre.Count * 100)
        $genrexml = [xml]$webClient.DownloadString($DomainUrl + $BaseUrl + "?genre=" + $genre.name)

        ## create a directory structure by genre with a two-level hierarchy <FirstCharacter>/<GenreName>
        $firstChar = $genre.name.Substring(0, 1)
        if ($firstChar -match "[A-Za-z]")
        {$targetDir = New-Item -Path $targetRoot -Name (ConvertTo-PathString $firstChar) -Type directory -Force}
        else
        {$targetDir = Get-Item $targetRoot}
        $directory = New-Item -Path $targetDir.FullName -Name (ConvertTo-PathString $genre.name) -Type directory -Force

        $stationCounter = 0

        ## walk through the stations of a single genre; only stations with bitrate >= 128 are retrieved
        ## (the br attribute is a string, so [int] forces a numeric comparison)
        $stations = @($genrexml.stationlist.station | Where-Object {[int]$_.br -ge 128})
        $stations | foreach {
            $stationCounter++
            Write-Progress $_.name "Executing" -Id 1 -PercentComplete ($stationCounter / $stations.Count * 100) -ParentId 0
            $sourceUrl = $DomainUrl + $genrexml.stationlist.tunein.base + "?id=" + $_.id
            $targetPath = Join-Path $directory.FullName ((ConvertTo-PathString $_.name) + ".pls")
            if ((Test-Path -LiteralPath $targetPath) -eq $false)
            {
                $pls = $webClient.DownloadString($sourceUrl)
                ## $pls="dummy"
                ## the server may wrap the message in extra output, so look for it anywhere in the response
                if (-not $pls.Contains($Tmr))
                {
                    try
                    {
                        Set-Content -LiteralPath $targetPath -Value $pls -ErrorAction Stop
                        "$targetPath written."
                        $Successes++
                    }
                    catch
                    {
                        "$targetPath couldn't be written."
                        $WriteErrors++
                    }
                }
                else
                {
                    "$targetPath couldn't be retrieved."
                    $RetrieveErrors++
                }
            }
            else
            {
                "$targetPath exists."
                $Existing++
            }
        }
    }
    "Successes : $Successes"
    "Existing : $Existing"
    "WriteErrors : $WriteErrors"
    "RetrieveErrors: $RetrieveErrors"
    "finished."

Not the cleanest piece of code, but it seems to do what's expected. It was a lot of fun!
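
The script's ConvertTo-PathString only targets the asterisk. A broader variant could percent-encode every character Windows rejects in file names. This is a sketch, not part of the script above; the function name ConvertTo-SafeFileName is made up, and the character list is the usual Windows reserved set:

```powershell
function ConvertTo-SafeFileName {
    param([string]$Name)

    # Characters Windows does not allow in file names: \ / : * ? " < > |
    $pattern = '[\\/:*?"<>|]'

    # Replace each match with '%' followed by its two-digit hex code, e.g. '*' becomes %2A.
    [regex]::Replace($Name, $pattern, {
        param($match)
        '%{0:X2}' -f [int][char]$match.Value
    })
}

ConvertTo-SafeFileName 'AC/DC: Live*'
```

Unlike a plain removal, the encoded names stay unique and readable, at the cost of a few odd-looking file names on the LinkTheater.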



