RSS/XML feed parser

Here's some PHP:

PHP
function xml_parser($page,$container,$tags,$number,$cdata) {
  if (!$number) {$number=100;}
  $stories=0;
  $xml=file_get_contents($page);
  preg_match_all("/<$container>.+<\/$container>/sU",$xml, $items);
  $items=$items[0];
  $itemsArray=array();
   foreach ($items as $item) {
    $fields=array(); // $this is reserved in PHP, so the matched tags are collected in $fields instead
    for($i=0; $i<count($tags); $i++) {
    preg_match("/<$tags[$i](.+)(<\/$tags[$i]>)/sU", $item, $tag);
    $fields[$i]=preg_replace("/<$tags[$i]>(.+)(<\/$tags[$i]>)/sU",'$1',$tag);
    $fields[$i]=array_map('html_entity_decode', $fields[$i]);
    }
     if (count($itemsArray)<$number) {array_push($itemsArray, $fields);}
   }
  $theData="<dl>";
  foreach ($itemsArray as $item) {
  for($i=0; $i<count($tags); $i++) {
  $data[$i]=isset($item[$i][0]) ? $item[$i][0] : '';    } // empty string when a tag was not found
   $title=$data[0];
   $dpatterns[0]="/<img(.+)><\/img>/sU"; $dreplacements[0]='<img$1>';
   $dpatterns[1]="/<img(.+)\/>/sU"; $dreplacements[1]='<img$1>';
   $dpatterns[2]="/<(\/|)content?(.+|)>/sU"; $dreplacements[2]='';
   $dpatterns[3]="/border=\"0\"/sU"; $dreplacements[3]='';
   if ($cdata!='hide') {
    $dpatterns[4]="/<\!\[CDATA\[(.+)\]\]>/sU"; $dreplacements[4]='$1';
   }
   else {
    $dpatterns[4]="/<\!\[CDATA\[(.+)\]\]>/sU"; $dreplacements[4]='';
   }
   $description=preg_replace($dpatterns,$dreplacements,$data[1]);
   $link=preg_replace("/<link.+href=\"(.+)\"(.+|)\/>/sU",'$1',$data[2]);
   $date=$data[3];
   $theData.="
   <dt><a href=\"$link\">$title</a></dt>
   <dd class=\"story\">$description</dd>
   <dd>Date: $date</dd>\r";
  }
$theData.="</dl>";
return $theData;
}

$container='item';
$tags=array('title','description','link','pubDate');
$bbc=xml_parser("http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml",$container,$tags,10,'');
$cnn=xml_parser("http://rss.cnn.com/rss/cnn_topstories.rss",$container,$tags,10,'');

$tags=array('title','content:encoded','link','pubDate');
$lockergnome1=xml_parser("http://feed.lockergnome.com/nexus/all",$container,$tags,5,'hide');

$tags=array('title','content:encoded','link','pubDate');
$lockergnome2=xml_parser("http://feed.lockergnome.com/nexus/all",$container,$tags,5,'');

$container='entry';
$tags=array('title','content','link','published');
$flickr=xml_parser("http://api.flickr.com/services/feeds/photos_public.gne",$container,$tags,10,'');
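
The extraction step can be tried without fetching a live feed. What follows is only a minimal sketch, not the function above: `extract_items()` is a hypothetical helper, and the sample feed and its `example.com` links are made up for illustration. It applies the same idea (one ungreedy regex for the container, one per tag) to an inline string and returns plain arrays rather than HTML.

```php
<?php
// Hypothetical sample feed, inlined so the sketch runs without network access.
$xml = <<<'XML'
<rss><channel>
<item><title>First story</title><description><![CDATA[<p>Hello</p>]]></description><link>http://example.com/1</link><pubDate>Fri, 03 Sep 2010 22:39:10 GMT</pubDate></item>
<item><title>Second story</title><description>Plain text</description><link>http://example.com/2</link><pubDate>Sat, 04 Sep 2010 00:36:38 GMT</pubDate></item>
</channel></rss>
XML;

// extract_items() is a hypothetical helper, not part of xml_parser().
function extract_items($xml, $container, $tags, $limit = 100) {
    // Escape the names so a tag such as content:encoded is matched literally.
    $c = preg_quote($container, '/');
    // s: dot matches newlines, U: ungreedy, so each match stops at the first closing tag.
    preg_match_all("/<$c(?:\s[^>]*)?>(.+)<\/$c>/sU", $xml, $matches);
    $rows = array();
    foreach (array_slice($matches[1], 0, $limit) as $item) {
        $row = array();
        foreach ($tags as $tag) {
            $t = preg_quote($tag, '/');
            // A tag that is missing from the item yields an empty string.
            $row[$tag] = preg_match("/<$t(?:\s[^>]*)?>(.*)<\/$t>/sU", $item, $m)
                ? html_entity_decode($m[1])
                : '';
        }
        $rows[] = $row;
    }
    return $rows;
}

$rows = extract_items($xml, 'item', array('title', 'description', 'link', 'pubDate'), 10);
print_r($rows);
```

Returning arrays keeps the matching separate from the markup, so the CDATA handling and the `<dl>` output can be applied afterwards. Regex matching like this only copes with simple, non-nested tags.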

Here's some HTML with PHP:

HTML/PHP
<h2>bbc</h2>
<?php echo $bbc; ?>
<h2>cnn</h2>
<?php echo $cnn; ?>
<h2>lockergnome (hidden CDATA)</h2>
<?php echo $lockergnome1; ?>
<h2>lockergnome</h2>
<?php echo $lockergnome2; ?>
<h2>flickr</h2>
<?php echo $flickr; ?>

Here's what we get... (the latest feeds from the BBC, CNN, Lockergnome - once with CDATA hidden and once with it shown - and flickr).

bbc

New phone hacking inquiries call
Senior Labour politicians urge fresh inquiries into phone hacking claims surrounding the News of the World newspaper.
Date: Fri, 03 Sep 2010 22:39:10 GMT
Blair in 'radical Islam' warning
Former Prime Minister Tony Blair tells the BBC that radical Islam is the greatest threat facing the world.
Date: Fri, 03 Sep 2010 23:41:31 GMT
Earthquake hits south New Zealand
A state of emergency is declared in Christchurch after a 7.0-magnitude earthquake strikes New Zealand's South Island, injuring two people seriously.
Date: Fri, 03 Sep 2010 23:43:49 GMT
Police question Pakistan players
Police question the three Pakistan players accused of corruption, while the ICC says that trio implicated have a disciplinary case to answer.
Date: Sat, 04 Sep 2010 02:21:02 GMT
Taxpayers 'should not fund Pope'
Some 77% of Britons think taxpayers should not help pay for Pope Benedict XVI's visit to Scotland and England, a survey suggests.
Date: Sat, 04 Sep 2010 00:36:38 GMT
Tennessee mosque fire 'was arson'
A fire that damaged construction equipment at the site of a Tennessee Islamic centre was arson, investigators say.
Date: Sat, 04 Sep 2010 00:17:27 GMT
Poll 'backs move from New Labour'
A poll commissioned by Ed Miliband's leadership campaign finds voters are less likely to vote Labour if there is not a shift from New Labour policies.
Date: Sat, 04 Sep 2010 01:44:57 GMT
Bank customers in 'dire poverty'
Banks are accused of leaving some customers in "dire poverty" after taking money out of their accounts without permission.
Date: Fri, 03 Sep 2010 23:02:20 GMT
Worshippers 'just escaped blast'
A Hare Krishna temple in Leicester was evacuated seconds before an explosion almost destroyed the building, it emerges.
Date: Fri, 03 Sep 2010 21:33:06 GMT
Pakistan rally bomb kills dozens
A bomb kills at least 50 people at a Shia Muslim rally in the south-western city of Quetta, the second attack on Pakistan's religious minority in days.
Date: Fri, 03 Sep 2010 18:33:22 GMT

cnn

Earl downgraded as wind, rain hit northeastern U.S.
Earl was downgraded to a tropical storm as it spread wind and rain over Long Island and part of New England. Earl still had maximum sustained winds of 70 mph.
Date: Fri, 03 Sep 2010 22:43:06 EDT
Mistrial in alleged 'sham marriage' case
The sham marriage trial of actress Fernanda Romero, which the judge has likened to a soap opera, appeared threatened with a mistrial Friday after a dramatic turn a day earlier.
Date: Fri, 03 Sep 2010 22:37:45 EDT
Feds bust huge human-trafficking ring
Six job recruiters have been indicted in federal court in what the FBI has called the largest human-trafficking operation ever to result in charges in the United States.
Date: Fri, 03 Sep 2010 16:36:35 EDT
UPS plane crashes near Dubai, kills 2
A cargo plane has crashed in an uninhabited area near the Dubai airport, according to the official WAM news agency in the United Arab Emirates.
Date: Fri, 03 Sep 2010 17:56:15 EDT
Florida love triangle killer sentenced
A Florida judge sentenced Rachel Wade, the 20-year-old woman convicted of second-degree murder for fatally stabbing her romantic rival in a fight last year, to 27 years in prison Friday.
Date: Fri, 03 Sep 2010 20:02:35 EDT
Opinion: Push jobs bill, progressives
Now that we are a week removed from the march on Washington organized by the self-proclaimed rodeo clown, Glenn Beck, it's clear that the event was nothing more than an exercise in ego worship.
Date: Fri, 03 Sep 2010 16:57:00 EDT
Mexicans: U.S. cartoonist went too far
An American's cartoon showing the eagle in the Mexican flag dead in a pool of blood is drawing criticism.
Date: Fri, 03 Sep 2010 21:17:59 EDT
Acid victim: Shades, God saved eyes
Bethany Storro doesn't usually wear sunglasses, but she got a surprise paycheck and bought a pair earlier this week. Those sunglasses, she is convinced, saved her eyesight when a woman threw a cup of acid in her face 20 minutes later.
Date: Fri, 03 Sep 2010 18:35:43 EDT
Girl, 4, weighed 15 pounds at death
The mother of a 4-year-old girl, found dead in her Brooklyn home Thursday morning, was charged Friday with second-degree assault, reckless endangerment and endangering the welfare of a child, according to police.
Date: Fri, 03 Sep 2010 22:26:36 EDT
Religious leaders hit back at Hawking
After physicist Stephen Hawking's claim that God didn't create the universe, the head of the Church of England says that "physics on its own will not settle the question of why there is something rather than nothing."
Date: Fri, 03 Sep 2010 16:40:56 EDT

lockergnome (hidden CDATA)

The lockergnome feed seems to be down.

lockergnome

The lockergnome feed seems to be down.

flickr

DSC00807

keteepe2010 posted a photo:

DSC00807

Date: 2010-09-04T03:45:03Z
019_7_12_10

New Leaders Council posted a photo:

019_7_12_10

Date: 2010-09-04T03:45:05Z
Duyen Dang Viet Nam

•Linh Nhi Việt Nam• posted a photo:

Duyen Dang Viet Nam

Date: 2010-09-04T03:45:05Z
0515Exercise

go_adb_go posted a photo:

0515Exercise

Date: 2010-09-04T03:45:06Z
.

annaliviams posted a photo:

.

Date: 2010-09-04T03:45:02Z
277

Noling- posted a photo:

277

Date: 2010-09-04T03:45:02Z
DSC_0884

crichgraphics posted a photo:

DSC_0884

Date: 2010-09-04T03:45:03Z
IMG_3736

spclzd posted a photo:

IMG_3736

Date: 2010-09-04T03:45:03Z
DSC_0025

Meredithfp posted a photo:

DSC_0025

Date: 2010-09-04T03:45:04Z
sleepin billie

melissaox posted a photo:

sleepin billie

Date: 2010-09-04T03:45:04Z

Comments

#1
2007-03-02 dumb_dave says :

Sorry, I'm new to this stuff, willing to learn and all that, but I don't get the idea. Copy that snippet of PHP code into a file and call it, say, parser.php. Copy the other snippet of HTML into a file and call it, for lack of inventiveness, parser.html. Right so far? If so, where's the intermediate step? How does this HTML "call" or "include" the PHP in order to function? Or am I missing something so basic that even asking this will earn me the cherished "Idiot of the Day Award"? Thanks.

#2
2007-03-02 BonRouge says :

dave,
You can include the php or just have it in one page. The page would have a '.php' extension - not '.html.'
Here's a simple example of this page (with no style or anything) in one file.
Save it and change the extension to '.php'. If you don't have a server installed on your machine, you'll have to upload it to a remote server to view it.
If you want, you can take the php code out of that page and save it in a different file and include it into the page - that way, you could use it on more than one page if you wanted.

I hope that makes it a bit clearer.

#3
2007-03-02 dumb_dave says :

Thanks for the explanations. Much clearer now and ... yes, it indeed works like a champ. (Maybe I was just too tired? Putting 1 and 1 together and coming up with 11 instead of two?) Best regards and thanks for all the tips elsewhere as well.

#4
2007-03-07 dumb_dave says :

Useful indeed, BonRouge, but how does one display the <description> tagged material that is buried behind things like <![CDATA[ <p> etc.? Is the PHP code easily modified to handle that? And if so, can one apply it selectively? That is, show the fuller "description" material for one site but then reduce the next site entry to "headlines" only (i.e., "titles" and "links") and then toggle the next one back to fuller details? Hope this is not a major headache, but it's beyond my ability to work it out at this stage ... and everything tried brought the larger process to a grinding halt. (This isn't a do-my-homework-for-me question. I'm bewildered by the code.) Thanks.

#5
2007-03-07 BonRouge says :

dave,
I thought I'd already sorted out the problem of data wrapped in the CDATA stuff. Does the code have a problem? If you could show me where it's not working, I'll try to improve it.
As for choosing whether to show that particular data or not, yes - I think you could do that by adding another variable. You see near the top where there's a preg_replace() to remove the CDATA tags? You could put that in an if statement - if the variable is not present, remove the CDATA tags, if it is, leave them where they are.
Does that make sense?
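
The toggle described here can be sketched on its own. This is only an illustration under assumptions: `handle_cdata()` and its `$mode` argument are hypothetical names, chosen to mirror the `$cdata` parameter ('hide' or an empty string) of the `xml_parser()` function at the top of the page.

```php
<?php
// handle_cdata() is a hypothetical helper illustrating the toggle; it is not part of xml_parser().
function handle_cdata($text, $mode = '') {
    // s: dot matches newlines, U: ungreedy, so each CDATA section is handled separately.
    $pattern = '/<!\[CDATA\[(.*)\]\]>/sU';
    // 'hide' drops the whole CDATA section; any other value keeps its contents and removes only the wrapper.
    $replacement = ($mode === 'hide') ? '' : '$1';
    return preg_replace($pattern, $replacement, $text);
}

$raw = 'Intro <![CDATA[<p>Full story</p>]]> end';
echo handle_cdata($raw), "\n";         // CDATA wrapper removed, contents kept
echo handle_cdata($raw, 'hide'), "\n"; // CDATA section removed entirely
```

Calling it with 'hide' for one feed and without it for the next gives the per-feed choice asked about above.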

#6
2007-03-10 BonRouge says :

dave,
I think I found the problem and sorted it out. As you can see, it seems to work OK now. Some of the characters in the Lockergnome feed don't show right on this page though. I wonder if it's anything to do with me being in Japan. Do you see strange characters?

#7
2007-05-01 Ice says :

I have been trawling the web for days looking for something like this. Thanks a WHOLE lot man. I was also wondering if you can modify this parser to merge these fields and display, say, only the latest 10 items?

#8
2007-11-02 steve says :

Thanks, this sorted out my CDATA parsing problem. It seems that is not too clear in the docs.

s
