Showing posts with label blogger. Show all posts

Saturday, March 13, 2010

Posterous Blogger Sidebar Widget Thumbnail Feed Script

It's been a while since this blog actually lived up to its name and I posted something to do with actual hacking on my Linux box.

You may recall a post a while back where I used the 'sed' command to create a modified copy of my TwitPic feed so that a thumbnail would show up when I imported the feed into a Blog List gadget in Blogger.

Well, I recently switched from using TwitPic for uploading pictures from my phone to using Posterous for uploading pictures and video from my phone.  There were many reasons in the "pros" column, but in the "cons" was the fact that, when I imported my feed into that same Blogger widget, no thumbnail appeared.

So, just like with the TwitPic feed, I set out to modify my Posterous feed in order to get the thumbnail to appear.  One problem I encountered is that the two feeds use totally different formats.  I based my TwitPic feed modification on a feed I knew to be working (from Digg), but performing that same transformation on the Posterous feed proved to be problematic.

What I ended up doing was simply extracting the information I needed from the Posterous feed, and then creating a one-item feed in the known-good format.  The feed looks nothing like the original Posterous feed, but that's just fine, since all it will be used for is pulling the latest post into my blog sidebar.

One improvement I'm considering working on is providing a useful thumbnail when I upload a video.  Currently (at least with the 3gp format), the Posterous feed just sticks a generic blank file icon in the thumbnail field.  What I would like is a still frame from the movie.  In order to do this myself, I would need to download the enclosure link, process the video into a still image, post the image on the web, and then put the image URL into the feed.  All very doable given the right tools.
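
The first two of those steps (downloading the enclosure and grabbing a still frame) might look roughly like the sketch below. It is only a sketch under assumptions: it assumes the feed carries the video in an <enclosure> element whose url attribute is quoted, and that wget and ffmpeg are installed. The function names, the one-second offset, and the output file name are made up for illustration and are not part of my script. Publishing the image and putting its URL into the feed are left out.

```sh
#!/bin/sh
# Sketch only: the function names and the 1-second offset are illustrative
# assumptions, not part of the original posterous.sh script.

# Print the URL of the first <enclosure> element found in an RSS file.
# Assumes the url attribute is quoted with either ' or ".
first_enclosure_url() {
  grep -Eo "<enclosure[^>]*url=[\"'][^\"']*" "$1" | head -n1 | sed -e "s/.*url=[\"']//"
}

# Download a video and save a single frame from it as an image.
# The image format follows the extension of the output file name.
# Requires wget and ffmpeg.
video_to_still() {
  video_url=$1
  still_file=$2
  video_file=$(mktemp) || return 1
  # Seek one second in, then write exactly one video frame.
  wget -q "$video_url" -O "$video_file" &&
    ffmpeg -y -loglevel error -ss 1 -i "$video_file" -frames:v 1 "$still_file"
  status=$?
  rm -f "$video_file"
  return $status
}
```

With the variables from the script below, this could be called as: video_to_still "$(first_enclosure_url "$TMP_FILE")" "$FEED_DIR/still.jpg"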

I'll have to test out what happens when I use the MP4 format for video, which my phone is also capable of creating.

Here's my script (so far).  Feel free to use it under the terms of the license listed below.  If you have any questions or suggestions, please feel free to leave a comment.

posterous.sh (run as an hourly cron job):
#!/bin/sh

# Copyright 2010 Tim "burndive" of http://burndive.blogspot.com/
# This software is licensed under the Creative Commons GNU GPL version 2.0 or later.
# License information: http://creativecommons.org/licenses/GPL/2.0/

# This script was obtained from here:
# http://tuxbox.blogspot.com/2010/03/posterous-blogger-sidebar-widget.html

DOMAIN=$1
FEED_DIR=$2
FEED_TITLE=Posterous
FEED_DESC="The purpose of this feed is to provide a thumbnail of the latest item in a Blogger sidebar widget."


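# Quote the arguments so that empty values or values containing spaces are
# tested correctly, and exit with a non-zero status on error.
if [ -z "$DOMAIN" ]; then
  echo "You must enter a Posterous DOMAIN." >&2
  exit 1
fi

if [ -z "$FEED_DIR" ]; then
  echo "You must supply a directory." >&2
  exit 1
fi

if [ ! -d "$FEED_DIR" ]; then
  echo "You must supply a valid directory." >&2
  exit 1
fi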

FEED_URL="http://$DOMAIN/rss.xml"
TMP_FILE="/tmp/posterous-$DOMAIN.xml"
FEED_FILE="$FEED_DIR/posterous-$DOMAIN.xml"

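# Fetch the RSS feed.  wget -O creates the output file even when the
# download fails, so check wget's exit status and that the file is non-empty.
if ! wget -q "$FEED_URL" -O "$TMP_FILE" || [ ! -s "$TMP_FILE" ]; then
  echo "Failed to download $FEED_URL to $TMP_FILE" >&2
  rm -f "$TMP_FILE"
  exit 1
fi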

NEW_LATEST=`grep guid "$TMP_FILE" | head -n1`

if [ ! -f "$FEED_FILE" ]; then
  FEED_LATEST=""
else
  FEED_LATEST=`grep guid "$FEED_FILE" | head -n1`
fi

# Debugging output (uncomment to troubleshoot)
#echo "FEED_LATEST: $FEED_LATEST"
#echo "NEW_LATEST : $NEW_LATEST"

if [ "$FEED_LATEST" = "$NEW_LATEST" ]; then
#  echo "There is no change in the feed."
#  echo "FEED_LATEST: $FEED_LATEST"
  # Nothing new: remove the downloaded copy before exiting
  rm -f "$TMP_FILE"
  exit
fi

IMG_HTML=`grep -i "img src" $TMP_FILE | head -n1 | grep -Eo "<img src='[^']*'[^>]*>" | sed -e 's/\"/\&quot;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'`
#echo "IMG_HTML: $IMG_HTML"

IMG_URL=`grep -i "img src" $TMP_FILE | head -n1 | grep -Eo "http:[^']*" | tail -n1`
#echo "IMG_URL: $IMG_URL"

# Create a minimalist RSS feed (everything inside the braces is written
# to the feed file in one go)
{
  echo "<?xml version='1.0'?>"
  echo "<rss version='2.0' xmlns:media='http://search.yahoo.com/mrss/'>"
  echo "<channel>"
  echo "<title>$FEED_TITLE</title>"
  echo "<description>$FEED_DESC</description>"
  echo "<link>http://$DOMAIN/</link>"

  echo "<item>"
  grep "<title>" "$TMP_FILE" | head -n2 | tail -n1
  grep "<pubDate>" "$TMP_FILE" | head -n1
  echo "<description>$IMG_HTML</description>"
  grep "<link" "$TMP_FILE" | head -n3 | tail -n1
  echo "$NEW_LATEST"
  echo "<media:thumbnail url=\"$IMG_URL\" height=\"56\" width=\"75\" />"
  echo "</item>"

  echo "</channel>"
  echo "</rss>"
} > "$FEED_FILE"

# Clean up
rm "$TMP_FILE"
CC-GNU GPL
This software is licensed under the CC-GNU GPL version 2.0 or later.

Monday, April 13, 2009

That's What She sed

Lately, I've been uploading pictures to Twitter from my phone using TwitPic. Basically, you send them to a TwitPic e-mail address via multimedia messaging, and they are automatically posted to your Twitter account, along with the text from the subject. This all works quite well, and they even supply an RSS feed of your pictures, which you can take and (among other things) put on your blog's sidebar.

The problem was, when I put it in my blog sidebar, there was no thumbnail image. Other feeds that had images in them would have thumbnails, but not this one. This one just had a text link to the picture page. I found that disappointing.

So I examined feeds that showed thumbnails and the TwitPic feed to see what the difference was. Feeds that contained images within the feed content showed up in the Blogger widget with a thumbnail. But the TwitPic feed contained images, too. What was the difference? The difference turned out to be CDATA. CDATA is a way to tell a feed reader, "Don't try to decipher my contents, just pass them along and leave the rendering to the end user application." It so happens that TwitPic's thumbnail images are within a CDATA block, and Blogger obediently ignores the CDATA contents when looking for images to display as a thumbnail.

So, how do I fix that? I need to read the feed, and for each item, locate the line that contains the thumbnail URL, and create a new attribute containing the thumbnail in a format that is decipherable to Blogger's widget. Using my Digg feed as a model, I figured out what the end result should look like, but how to achieve it?

First, I tried Yahoo Pipes. Yahoo Pipes processes feeds with a number of tools, controlled by a graphical pipe-looking interface. The problem is, none of the tools that I could find would add an attribute based on the transformed contents of another attribute. There were widgets that came close, but I couldn't get it to work, so I decided to host the feed myself and modify it using sed.

I had never used sed before, except when the exact command was given, so I didn't know how to use it, but I knew that it was a powerful enough tool to get the job done. So I created a shell script on my Linux box, and a cron job to run it. The script basically downloads the RSS feed from TwitPic to a local file, and then calls sed on it with a particular set of parameters designed to extract the necessary information and add it back in a format that is decipherable to Blogger.

In order to understand sed, I searched the Internet for a tutorial, and found this page from the Gentoo Linux Documentation to be the most helpful. My sed command does two things, which are piped together:
  1. It adds an xmlns:media declaration, which allows me to use the media tag later on.
  2. It examines each CDATA line with the thumbnail URL, and below it, it adds a line with the media:thumbnail tag and the URL extracted from above.
sed -e 's/<rss version="[^"]*"/& xmlns:media="http:\/\/search.yahoo.com\/mrss\/"/g' $TMP_FILE | sed -e 's/\(http:\/\/twitpic.com\/show\/thumb\/[^"]*\).*/&\n <media:thumbnail url="\1" height="150" width="150" \/>/g' > $FEED_FILE
I know it's possible to consolidate the two sed commands into one and do it in one pass, but this works. I may tweak it in further revisions. It is not necessary to use a Yahoo-defined media tag, so I might modify the script later on to simply transform the CDATA portion into parseable encoded HTML.

I might also add that I'm using Feedburner to host the feed. Basically, I change the file on my server, and Feedburner goes there to get it, and offers it to the rest of the world. That way if my server is offline, the feed is still active and available, and I don't have to deal with the traffic, just the Feedburner hits.

If anyone else wants their TwitPic feed to have thumbnails available, let me know, and I can set one up for you on my server through Feedburner. (It's pretty easy, since the TwitPic username is passed in to the script as a parameter.) I can't guarantee anything, but since it's in my interest to keep the script working and up-to-date, you don't have much to worry about. All I need to know is your TwitPic (Twitter) username.
  • Update (2009-04-16): I have modified the code to accept all image formats, and be shorter.
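
For what it's worth, consolidating the two sed commands into one pass could look something like the sketch below. It keeps the same two substitutions as the command above, but passes them as two -e expressions to a single sed call and uses | as the delimiter so the slashes in the URLs don't need escaping. The function name add_thumbnails is just something made up for this example, and the \n in the replacement assumes GNU sed (as found on a typical Linux box); other sed implementations may not turn it into a newline.

```sh
#!/bin/sh
# Sketch only: add_thumbnails is a hypothetical name, and the "\n" in the
# second replacement relies on GNU sed.

# Read the feed file given as $1 and print the modified feed to stdout.
add_thumbnails() {
  # 1st expression: add the xmlns:media declaration to the <rss> tag.
  # 2nd expression: after each line containing a TwitPic thumbnail URL,
  #                 add a media:thumbnail line with that same URL.
  sed -e 's|<rss version="[^"]*"|& xmlns:media="http://search.yahoo.com/mrss/"|g' \
      -e 's|\(http://twitpic.com/show/thumb/[^"]*\).*|&\n <media:thumbnail url="\1" height="150" width="150" />|g' \
      "$1"
}
```

In the script, this would be used as: add_thumbnails "$TMP_FILE" > "$FEED_FILE"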

Thursday, March 05, 2009

Google Friend Connect

I just published a post on my other blog about Google Friend Connect (and Blogger's "Follow" feature). I describe what they are, and how they work, as far as I can tell. I decided to post it over on my non-technical blog, because it's targeted at your average Internet user, and doesn't require any technical knowledge to follow (at least that's what I think). I would encourage those who enjoy my blogs to "Follow" them.

Wednesday, December 10, 2008

Blogger.com and the Broken Comment Form: Third Party Cookies

A while ago, I noticed that, for some reason, when I was on a Blogger blog on the blogspot.com domain (such as this one), I was no longer seen as "logged in", even though when I went to the Blogger home page, or clicked the "Sign in" link on the upper right hand corner (or the "B" icon on the upper left), I was automatically signed in with my Google account. This was only a minor annoyance to me (it prevented the "edit" icons from showing up for me on the widgets of my own blogs) until I started using the in-line comment form for my blogs. This feature puts the comment form right below the post on its page, instead of opening a separate window or navigating away to a special comments page. I like it because it allows for immediate comment gratification, and a smooth user experience.

The problem was that now my comment forms seemed to be broken if I was signed in with my Google account. (Google owns Blogger, and they transitioned away from the old Blogger IDs to Google accounts a while ago.) After doing some research on this problem, I came to the conclusion that the source of the problem was Firefox: specifically that Firefox (since version 3.0) has the default setting of blocking all third-party cookies. (However, if you upgraded your profile from a previous version, you may still have the old default setting left over.)

A cookie is a bit of text that a web site can store (in order to read back later) to track information about a visitor to that website. A third-party cookie is a cookie that is loaded by a script that is not hosted on the same domain as the site that you are visiting. Most third-party cookies come from advertisers, whose ads are loaded and, in the background, like to keep track of which ads a user has seen and which sites they have visited.

There are a few companies that like to collect as much information as possible, and the fact that they have their advertising tendrils on so many sites gives them a disturbingly extensive ability to track users' browsing habits. This is why the makers of Firefox decided to block third-party cookies, except for those sites that the user has specifically granted permission to allow access from other sites. I agree with their assessment, and don't want to re-enable all third-party cookies, but I also want to allow certain sites that I trust to know who I am. Blogger is not nefarious in its desire to "track" me on blogs: it simply wants to let me log in, manage my blogs, and post comments.

So, how do I let Firefox know that blogger.com should be allowed to track me on non-blogger.com sites? Here is the answer: Go to the settings menu. This can be found under Tools -> Options on Windows and Edit -> Preferences on Linux. I'm not sure where it is on Mac OS X, but I'm sure it's not hard to find. Once you're there, select the Privacy panel, and under "Cookies", click on the "Exceptions..." button:
Type "blogger.com" into the text field and click "Allow":
You're done. Click Close and exit the settings menu.
The next time you log in to Blogger, external blogger sites (such as those on blogspot.com, as well as custom domains) will know who you are.

Thursday, June 21, 2007

Web Feeds and Aggregators: Thoughts

I just "discovered" Google Reader. Oh, no, I knew it was there all along. It's even one of Firefox's default feed subscription options. I had simply been ignoring its existence this whole time, content to use Firefox's Live Bookmarks feature for all my RSS/Atom needs.

For the uninitiated, a "web feed" is a way to "subscribe" to the content of a website, such as a blog, news outlet, podcast, or just about anything these days. After subscribing to a web feed, a visitor is automatically notified of new content on that website by their feed reader of choice. There are quite a few out there, including Firefox's Web Feeds feature, Google Reader, the Opera Browser, and Thunderbird. Basically, instead of having to go to every website to see if there is new content available, the reader can subscribe to the websites' feeds, and will be automatically notified of any new content on each site.

The problem is that some feed aggregators simply pull all of your content off the site, and allow the readers to get the content without visiting the site. This becomes problematic for ad-supported websites, which typically either draw the readers to the site by providing unique participatory content, such as a discussion forum or comments, by only providing a summary of the actual content in the feed, or by injecting ads into the feed.

I am not an ad-supported website, but I do like my readers to interact with me and each other through comments. If none of the users are drawn to my actual website, then none of them will see each other's comments. The other thing I like to do is keep track of roughly how many people are reading my blog, and Blogger doesn't provide tools to track users on the site itself, much less the feed. They do provide a mechanism to insert something into the feed at the bottom, which could be used to tally readers.

What I have been doing is having the feed only contain the first paragraph or so of the post, and then the readers are directed to the post's actual page. This may prove inconvenient for some readers, although I was trying out Thunderbird, and what it did with my blog was to simply load the post's page directly into the reading frame, which is actually ideal from my perspective. Other readers, particularly aggregators such as Google Reader, only display the text and image content, and use their own formatting. I have been considering switching the feed to contain the entire post, but I'm not yet sure.

What do you think? Is anyone actually reading this? Do you use web feeds? What reader(s) do you use? Do you prefer to have Blogger format the post, or do you prefer your reader's formatting?

I have decided that for the time being I will try out Google Reader for all of my friends' blogs, and for newsletters that I read every time, but for news sites where I tend to cherry-pick the articles, I'm sticking with Firefox's live bookmarks: it gives you a menu of the latest posts, with the ones you've read already grayed out. I wouldn't want my feed list to get clogged with every article on Ars Technica, Slashdot, Technocrat, and certainly not Digg.

Wednesday, March 28, 2007

Wordpress

When I switched from the old Blogger to the New Blogger beta, I posted about some of the changes that they had made, and also touched on some of the things I liked and didn't like about the changes. Since then, I've had some time to work with the new Blogger, and I've noticed some annoyances. Recently, I was (slightly) involved in beck switching her blog from Typepad to Wordpress. In researching Wordpress' features, I noticed a few things that I liked, and I also recently noticed that it is possible to move an entire blog, comments and all, from Blogger to Wordpress without much trouble. So the question arises, should I switch to Wordpress? Here are some things that I like about blogging with Wordpress over Blogger:
  • Big Brother: Google gets to correlate my blog with my search history, e-mail, my Google Checkout purchase history, etc. There are some parts of "the world's information" that I would like to keep unorganized and inaccessible, thank-you-very-much.
  • Login issues: whenever I log in to my e-mail account and the session expires, I get logged out of my blog. This is annoying, and it didn't happen before my Blogger account was absorbed into my Google account.
  • Web Statistics: Wordpress gives you excellent statistics, not only on traffic to your blog, but also on how many people are subscribed to your feeds. I use Webalizer and ClustrMaps to get something similar in nature, but I have no idea how many people are subscribed to my feeds or where else they come from.
At the same time, there are things that I still like about Blogger over Wordpress:
  • It requires no change: I'm already doing it.
  • Uploading pictures to Blogger posts with Picasa: it's easy, and the hosting is free (to a point).
  • Wordpress makes labels (tags/categories) as I use them less convenient, or at least so I hear. This is one feature that Blogger does quite well.
So far, I'm not annoyed enough to switch. If Google fixed the login/logout issues, I would be a lot less annoyed. If Wordpress were to start supporting the OpenID specification, it would be even more appealing, especially if Google didn't.

Wednesday, January 24, 2007

New Blogger Layout

You may have noticed that my blog now looks like everyone else's, with the navigational bar on the right. You may also have noticed other changes. These are all due to the fact that I just changed over from the old Blogger to the new Blogger, with its new layout engine. Before, I was able to use the same template on both of my blogs, simply by copying and pasting the template HTML from one blog to the other. With the new WYSIWYG editing, this is not possible, because the modules are unique to each blog, and copying a template which references nonexistent modules is a bad idea. I was able to keep most of the same page elements, though I have since removed some of the lists of links from this blog so that I only have to maintain one list. From this point, I expect the content of the nav bars to grow apart, so I'm paring them down to what makes sense for each. The advantages of the new layout engine over the old are:
  • Labels
  • Comment feeds
  • Graphical layout and template content management
  • Dynamic pages: instant publishing
One thing that I don't like as much about the whole "New Blogger beta" is the fact that it's tied to my Google account. Now, not only does Google have all of my e-mail indexed, it "knows" that the same person owns these two blogs, and performs the searches that I perform while logged in. It also has a few purchases associated with my account, thanks to Google Checkout. I'm not paranoid or anything, I just don't like that the same company has all of this information on me. It increases their confidence that they can predict what I will like (which no doubt drives the AdSense ads that I see while online), but that also decreases my freedom to dictate how I am perceived by the websites I use.

Obviously, I could use a separate account for all these things, which is pretty much what I was doing before I started using my Google account with Blogger, but there's the convenience factor. That, combined with the $20 off $50 deals they were offering, is why I use Google Checkout. The world would be a more convenient place if all websites had a single login in order to make a purchase instead of a separate account at each, but I would rather that my bank manage that account, and that my bank not also have access to all of my e-mail and online musings, conveniently tied together in the same account. Don't get me wrong, I'm rooting for Google against Yahoo, Microsoft, and the other portals because they generally do things right, but that doesn't mean I'm going to drink their Kool-Aid and trust them blindly. I may yet decide to create separate dedicated accounts for blogging, e-mail, and purchases. It's so convenient not to, though.