
« Newer Snippets
Older Snippets »
3 total  XML / RSS feed 

Scrape torrents on btjunkie

# download all .torrent links on the btjunkie frontpage
#
# more fun with mechanize @
# http://tramchase.com/scrape-myspace-youtube-torrents-for-fun-and-profit

require 'rubygems'
require 'mechanize'
require 'fileutils'

agent = WWW::Mechanize.new
agent.get("http://btjunkie.org/")
links = agent.page.search('.tor_details tr a')
hrefs = links.map { |m| m['href'] }.select { |u| u =~ /\.torrent$/ } # just links ending in .torrent
FileUtils.mkdir_p('btjunkie-torrents') # keep it neat
hrefs.each { |torrent|
  filename = "btjunkie-torrents/#{torrent.split('/')[-2]}" # second-to-last path segment
  puts "Saving #{torrent} as #{filename}"
  agent.get(torrent).save_as(filename)
}
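The href filter and filename derivation above can be exercised on their own; the sample links below are made up to mimic btjunkie's link shapes:

```ruby
# made-up hrefs standing in for the frontpage links
links = [
  '/browse/Audio',
  'http://dl.btjunkie.org/torrent/Some-Name/437abc/download.torrent',
  '/faq.html'
]
hrefs = links.select { |u| u =~ /\.torrent$/ } # just links ending in .torrent

# the second-to-last path segment becomes the local filename
filenames = hrefs.map { |t| "btjunkie-torrents/#{t.split('/')[-2]}" }
puts filenames # => btjunkie-torrents/437abc
```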

Scrape MySpace friend thumbnails

# fetch all img's from a myspace profile's .friendSpace div
# more @ http://tramchase.com/scrape-myspace-youtube-torrents-for-fun-and-profit

require 'rubygems'
require 'mechanize'
require 'fileutils'

agent = WWW::Mechanize.new
agent.get("http://myspace.com/graffitiresearchlab")
links = agent.page.search('.friendSpace img') # found w/ firebug
FileUtils.mkdir_p 'myspace-images' # make the images dir
links.each_with_index { |link, index| 
  url = link['src']
  puts "Saving thumbnail #{url}"
  agent.get(url).save_as("myspace-images/top_friend#{index}_#{File.basename url}")
}
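The save path pairs a friend index with `File.basename` of the image src; on its own, with a made-up src URL:

```ruby
# made-up thumbnail src, shaped like a typical CDN image URL
url = 'http://images.myspacecdn.com/profile01/100/s_tiny.jpg'
index = 3
path = "myspace-images/top_friend#{index}_#{File.basename url}"
puts path # => myspace-images/top_friend3_s_tiny.jpg
```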

Scrape YouTube thumbnails

# fun with mechanize
# more @ http://tramchase.com/scrape-myspace-youtube-torrents-for-fun-and-profit

require 'rubygems'
require 'mechanize'
require 'hpricot'
require 'fileutils'

agent = WWW::Mechanize.new
url = "http://gdata.youtube.com/feeds/api/standardfeeds/most_viewed" # all time
page = agent.get(url)
# parse again w/ Hpricot for some XML convenience
doc = Hpricot.parse(page.body)
# pp (doc/:entry) # like "search"; cool division overload
images = (doc/'media:thumbnail') # use strings instead of symbols for namespaces
FileUtils.mkdir_p 'youtube-images' # make the images dir
urls = images.map { |i| i[:url] }
urls.each_with_index do |file,index|
  puts "Saving image #{file}"
  agent.get(file).save_as("youtube-images/vid#{index}_#{File.basename file}")
end
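Hpricot is long unmaintained; the same media:thumbnail extraction can be sketched with Ruby's bundled REXML instead. The feed XML here is a cut-down, made-up stand-in for the real GData response:

```ruby
require 'rexml/document'

# trimmed stand-in for the most_viewed feed body
xml = <<-XML
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:media="http://search.yahoo.com/mrss/">
  <entry>
    <media:group>
      <media:thumbnail url="http://img.youtube.com/vi/abc123/0.jpg"/>
      <media:thumbnail url="http://img.youtube.com/vi/abc123/1.jpg"/>
    </media:group>
  </entry>
</feed>
XML

doc = REXML::Document.new(xml)
# the media: prefix is declared in the document, so the prefixed XPath resolves
urls = REXML::XPath.match(doc, '//media:thumbnail').map { |t| t.attributes['url'] }
puts urls
```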