#
# Created by Peter Krumins (peter@catonmat.net, @pkrumins on twitter)
# www.catonmat.net -- good coders code, great coders reuse
#
# Released under GNU GPL
#
# Developed as a part of redditriver.com project
# Read how it was designed:
# www.catonmat.net/blog/designing-redditriver-dot-com-website
#

# This is a config file for mobile url autodiscovery Python module
#

# Rewrite links for which it is hard to extract print/mobile links
#

# rewrites word 'article' to 'print_article' in URLs matching online.wsj.com
# -- can't rewrite this site, cause only older articles can be printed
# todo: make a smarter engine, to find date of post, and if it is older
#       than certain time, then do the rewrite
#REWRITE_URL online.wsj.com article print_article

# rewrite msnbc news
REWRITE_URL msnbc.msn.com /id/(\d+)/$ /id/\1/print/1/displaymode/1098/

# rewrite cbsnews
# it doesn't quite work, as we access the link, it checks for referrer and
# redirects back :(
#REWRITE_URL cbsnews.com /main(\d+).shtml$ /printable\1.shtml

# rewrite local6 news
REWRITE_URL local6.com /news/(.+) /print/\1

# Search the page for PRINT_LINK
# (Search is done case insensitive)
#
PRINT_LINK "format for printing"
PRINT_LINK "printable version"
PRINT_LINK "printable view"
PRINT_LINK "print version"
PRINT_LINK "print story"
PRINT_LINK "print this story"
PRINT_LINK "print page"
PRINT_LINK "print this page"
PRINT_LINK "print post"
PRINT_LINK "print this post"
PRINT_LINK "print this"
PRINT_LINK "print this article"
PRINT_LINK "print"

# Ignore urls matching these regexes
#
IGNORE_URL \.(jpg|jpeg|gif|png|bmp|tif|tiff)$
IGNORE_URL \.(mp3|ogg|mp4)$
IGNORE_URL \.(flv|swf)$
IGNORE_URL \.(mov|avi|divx|xvid|wmv|rm|mpg|mpeg|mkv)$
IGNORE_URL \.(torrent)$
IGNORE_URL \.(pdf|ps|dvi)$
IGNORE_URL \.(tgz|gz|tar)$
IGNORE_URL flickr\.com