I noticed Yahoo Pipes pretty soon after it launched nearly two years ago, but other than aggregating RSS feeds I couldn’t really get into it. The visual pseudo-flow-diagram programming didn’t gel with me at all, and I kept thinking I could have achieved the same results in less time in PHP.
Well, after chatting with the people on the Yahoo Developers booth at FOWA the other week, and this week failing to find a decent UK Xbox 360 release calendar in iCal format, I decided to have another crack at it, using the messy data on Gamestation’s website as a source. The result: Gamestation XBox 360 releases. I’ve also written a conversion script in PHP to make that data iCal-compatible and therefore importable into pretty much any calendaring system you’d care to mention.
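To give a rough idea of what that conversion step involves, here is a minimal sketch. It is in Python rather than PHP and is not the actual script: the function name `to_ical`, the `PRODID` and `UID` values and the sample title are all made up for illustration, and it assumes the pipe output has already been boiled down to (title, date) pairs.

```python
from datetime import date, timedelta


def to_ical(releases):
    """Build a minimal iCalendar document from (title, release_date) pairs."""
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//release-calendar//EN",  # hypothetical product id
    ]
    for index, (title, day) in enumerate(releases):
        # Escape the characters that are special in iCalendar text values.
        summary = (title.replace("\\", "\\\\")
                        .replace(";", "\\;")
                        .replace(",", "\\,"))
        next_day = day + timedelta(days=1)
        lines += [
            "BEGIN:VEVENT",
            # UID and DTSTAMP are required per event; these values are
            # placeholders derived from the release date (an assumption).
            f"UID:release-{index}-{day:%Y%m%d}@example.invalid",
            f"DTSTAMP:{day:%Y%m%d}T000000Z",
            # An all-day event runs from the release date to the day after.
            f"DTSTART;VALUE=DATE:{day:%Y%m%d}",
            f"DTEND;VALUE=DATE:{next_day:%Y%m%d}",
            f"SUMMARY:{summary}",
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    # iCalendar lines are terminated with CRLF.
    return "\r\n".join(lines) + "\r\n"


print(to_ical([("Example Game, Deluxe", date(2008, 11, 7))]))
```

A real script would also want to fold long lines and use the actual generation time for `DTSTAMP`; both are left out here to keep the sketch short.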
If you’re used to writing screen-scraping software in a real programming language, there are a few mental barriers you have to clear before you get anywhere:
- There is no regexp “match” function. To extract a term you have to use substitute, making sure you put “.*” at either end of the pattern to remove the unwanted text.
- You have to program loops one after the other instead of nesting all the logic inside one big loop, which makes a pipe a very odd thing to read.
- Multiline regexps are a headache to write because the regexp input is a single line. I got around this by collapsing the multi-line string into a single line with “s/[\r\n]//” before doing the proper regexp operations.
- It’s really quite buggy. Under Firefox on Windows, drag-and-drop operations would stop working after a while, and under Firefox on Ubuntu the whole edit area would eventually turn into a bit of a mess. Save frequently.
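The regexp workarounds in the list above are easier to see in code. This is only a sketch of the idea in Python, not how Pipes implements it: the sample HTML fragments, the date pattern and the helper names `flatten` and `extract` are invented for illustration. Each helper is a separate pass over every item, which mirrors the one-loop-after-another style.

```python
import re


def flatten(items):
    """Pass 1: strip line breaks so each item becomes a single line."""
    return [re.sub(r"[\r\n]", "", item) for item in items]


def extract(items, pattern):
    """Pass 2: emulate 'match' using only substitution.

    Wrapping the pattern in '.*' makes it swallow the whole line, which is
    then replaced by capture group 1. If the pattern does not match, the
    item comes back unchanged, so it is worth checking for that.
    """
    wrapped = ".*" + pattern + ".*"
    return [re.sub(wrapped, r"\1", item) for item in items]


# Hypothetical scraped fragments, with the date split across lines.
items = [
    "<td>Example Game</td>\r\n<td>Release date:\r\n07/11/2008</td>",
    "<td>Another Game</td>\r\n<td>Release date:\r\n14/11/2008</td>",
]

dates = extract(flatten(items), r"Release date:\s*(\d{2}/\d{2}/\d{4})")
print(dates)  # ['07/11/2008', '14/11/2008']
```

In a language with a proper match function you would do both steps inside one loop over the items; here they are deliberately kept as two chained passes.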
Anyway, it’s pretty good once you get your head around it and it outsources some of the heavy lifting involved in screen-scraping to Yahoo instead of your own site. I’d like to see the ability to export in iCal format as standard in future updates.