Scraping scores off the nfl.com site.

dwmyers

I've thought about resurrecting an old power ranking formula I used in my college days, and being of a lazier mindset than I was as a young guy, I'd rather have my program load its own data and, if at all possible, from as authoritative a source as I can find. I'm going to present a small Perl program that does just that.

The program scrapes scores off nfl.com. It's tested on Linux, but I commonly run Perl programs on Win32 using ActiveState's ActivePerl. There, you'll have to figure out how to install WWW::Mechanize using ActiveState's module loader (PPM). I know the knowledge is out there.
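
Before running the scraper, it can help to confirm that the modules it needs are actually installed. The following is a minimal sketch, not part of the original program; `module_available` is a hypothetical helper name chosen for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script):
# returns 1 if the named module can be loaded, 0 otherwise.
sub module_available {
    my ($module) = @_;
    # Turn a package name into a file path: WWW::Mechanize -> WWW/Mechanize.pm
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

# The three modules the scraper loads with "use".
for my $module (qw(WWW::Mechanize Pod::Usage Getopt::Long)) {
    printf "%-16s %s\n", $module, module_available($module) ? "ok" : "MISSING";
}
```

If WWW::Mechanize shows up as MISSING, the usual routes are `cpan WWW::Mechanize` on Linux or, assuming a PPM-based ActivePerl install, `ppm install WWW-Mechanize`.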

The code is as follows:
Code:
#!/usr/bin/perl
#
#
=head1 NAME

 get-nfl-scores.pl

=head1 AUTHOR

 Author   : dwmyers AT cowboyszone.com
 Date     : 9-14-2007
 Modified : N/A

=head1 SYNOPSIS

 get-nfl-scores.pl [--help] [--man] [--debug] [week-number]

 Scrapes NFL game scores off nfl.com for one week of play.

=head1 NOTES 

 Dependent on page layout. If NFL.COM changes page layout,
 the scrape will fail.

=head1 COPYRIGHT AND LICENSE

 This code is Copyrighted 2007 by dwmyers AT cowboyszone.com. 
 All rights are reserved.

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <http://www.gnu.org/licenses/>.

=cut

use warnings;
use strict;

use WWW::Mechanize;
use Pod::Usage;
use Getopt::Long;

my $debug = 0;
my $help = 0;
my $man = 0;

GetOptions('help|?' => \$help, 
           man => \$man,
           debug => \$debug ) or pod2usage(1);

pod2usage(1) if $help;
pod2usage(-verbose => 2) if $man;


my $sbox = qr/\"scoreBox\"/i;
my $teamlogo = qr{\"/teams/profile\?team=(\w\w?\w?\w?)\"}i;
my $teamscore = qr{\"scoresBoxTeamScore\"\>(\d\d?\d?)}i;

my $week = shift || 1;
die("The parameter should be a week number.\n") unless $week =~ /^\d\d?$/;
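# The season is hard-coded to 2007; only the week varies.
my $url = "http://www.nfl.com/scores?season=2007&week=Week+$week";
my $w = WWW::Mechanize->new();
my $r = $w->get($url);
# Stop early with the HTTP status if the page could not be fetched.
die("Could not fetch $url: ", $r->status_line, "\n") unless $r->is_success;
my $content = $r->content();
# Flatten the page, then split on '<' so each chunk holds one tag
# plus whatever text follows it.
$content =~ s/\n//g;
my @stuff = split /\</, $content;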
my $fetch = 0;
my $done = 0;
my $team1 = "";
my $team1score = 0;
my $team2 = "";
my $team2score = 0;
my $lineno = 0;
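# Walk the chunks. $fetch switches on at each "scoreBox" marker and
# off again once a complete game has been printed.
for (@stuff) {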
    $lineno++;
    if ( $_ =~ $sbox ) {
        $fetch = 1;
    }
    next unless $fetch;
    if ( $team1 eq "" && $_ =~ $teamlogo ) {
        my $logo = $1;
        print "logo found: logo = $logo\n" if $debug;
        print "line is $_ && # is $lineno.\n" if $debug;
        $team1 = $logo;
        next;
    }
    if (  $team1 eq "" && $_ =~ $teamscore ) {
        my $score = $1;
        print "team1 scored $score points\n" if $debug;
        print "line is $_ && # is $lineno.\n" if $debug;
        $team1score = $score;
    }
     
    if ( $team1 ne "" && $team2 eq "" && $_ =~ $teamlogo ) {
        my $logo = $1;
        print "logo found: logo = $logo\n" if $debug;
        print "line is $_ && # is $lineno.\n" if $debug;
        $team2 = $logo;
    }
    if ( $team2 ne "" && $_ =~ $teamscore ) {
        my $score = $1;
        print "$team2 scored $score points\n" if $debug;
        print "line is $_ && # is $lineno.\n" if $debug;
        $team2score = $score;
        $done = 1;
    }
    if ( $done ) {
        $done = 0;
        $fetch = 0;
        print "GAME SCORE: $team1 $team1score $team2 $team2score \n";
        $team1 = "";
        $team1score = 0;
        $team2 = "";
        $team2score = 0;
    }
}
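
Since the scrape depends on page layout, it is handy to exercise the two patterns offline before pointing them at the live site. The sketch below reuses the `$teamlogo` and `$teamscore` patterns from the script, but it swaps the chunk-by-chunk state machine for a simpler per-box global match. `parse_games` is a hypothetical helper, and the HTML fragment is made up: it assumes each score box contains exactly two team links and two scores, which may not match what nfl.com really serves.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same patterns as in the script above.
my $teamlogo  = qr{\"/teams/profile\?team=(\w\w?\w?\w?)\"}i;
my $teamscore = qr{\"scoresBoxTeamScore\"\>(\d\d?\d?)}i;

# Hypothetical helper: returns one [team1, score1, team2, score2]
# array reference per score box found in the HTML.
sub parse_games {
    my ($html) = @_;
    my @games;
    # Text before the first "scoreBox" marker is not part of any game.
    my (undef, @boxes) = split /\"scoreBox\"/i, $html;
    for my $box (@boxes) {
        my @teams  = $box =~ /$teamlogo/g;    # team codes, in page order
        my @scores = $box =~ /$teamscore/g;   # scores, in page order
        # Assumption: the Nth score belongs to the Nth team in the box.
        next unless @teams >= 2 && @scores >= 2;
        push @games, [ $teams[0], $scores[0], $teams[1], $scores[1] ];
    }
    return @games;
}

# Made-up fragment with placeholder team codes; NOT the real nfl.com markup.
my $sample = <<'HTML';
<div class="scoreBox">
 <a href="/teams/profile?team=AAA">A</a><span class="scoresBoxTeamScore">21</span>
 <a href="/teams/profile?team=BBB">B</a><span class="scoresBoxTeamScore">17</span>
</div>
<div class="scoreBox">
 <a href="/teams/profile?team=CCC">C</a><span class="scoresBoxTeamScore">3</span>
 <a href="/teams/profile?team=DDD">D</a><span class="scoresBoxTeamScore">10</span>
</div>
HTML

printf "GAME SCORE: %s %s %s %s\n", @$_ for parse_games($sample);
```

On the sample this prints `GAME SCORE: AAA 21 BBB 17` and `GAME SCORE: CCC 3 DDD 10`. Saving a real scores page to disk and feeding it to `parse_games` is a quick way to tell whether the patterns still match after a layout change.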
 