I am just getting started with the Simple HTML DOM parser and have run into a problem right at the beginning.
I am following this tutorial:
//net.tutsplus.com/tutorials/php/html-parsing-and-screen-scraping-with-the-simple-html-dom-library/
I simply want to find, in a page's source code, the content of a div with the class ClearBoth Box.
I retrieve the code with cURL and create a Simple HTML DOM object:
$cl = curl_exec($curl);
$html = new simple_html_dom();
$html->load($cl);
Then I wanted to add the content of the div into an array called divs:
$divs = $html->find('div[.ClearBoth Box]');
But when I print_r $divs, it contains much more than expected, even though the div in the source code holds nothing else.
The output looks like this:
Array
(
    [0] => simple_html_dom_node Object
        (
            [nodetype] => 1
            [tag] => br
            [attr] => Array
                (
                    [class] => ClearBoth
                )
            [children] => Array
                (
                )
            [nodes] => Array
                (
                )
            [parent] => simple_html_dom_node Object
                (
                    [nodetype] => 1
                    [tag] => div
                    [attr] => Array
                        (
                            [class] => SocialMedia
                        )
                    [children] => Array
                        (
                            [0] => simple_html_dom_node Object
                                (
                                    [nodetype] => 1
                                    [tag] => iframe
                                    [attr] => Array
                                        (
                                            [id] => ShowFacebookButtons
                                            [class] => SocialWeb FloatLeft
                                            [src] => //www.facebook.com/plugins/xxx
                                            [style] => border:none; overflow:hidden; width: 250px; height: 70px;
                                        )
                                    [children] => Array
                                        (
                                        )
                                    [nodes] => Array
                                        (
                                        )
I do not understand why $divs does not simply contain the code from the div.
Here is an example of the source code on the site:
gute peppige Qualität [17.03.2013]
gute Verarbeitung, schönes Design,
What am I doing wrong?
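One possible explanation, offered as a sketch and not a confirmed diagnosis: `div[.ClearBoth Box]` mixes class syntax (`.name`) into attribute brackets, so the parser may fall back to a looser match such as any element with the class ClearBoth, which would be consistent with the br node in the output above. The example below matches the full class attribute value instead. It assumes `simple_html_dom.php` from the library is available next to the script, and the markup is hypothetical, standing in for the page fetched with cURL.

```php
<?php
// Assumption: simple_html_dom.php from the Simple HTML DOM library is in the same directory.
require_once 'simple_html_dom.php';

// Hypothetical markup standing in for the page fetched with cURL.
$source = '<div class="SocialMedia"><br class="ClearBoth"></div>'
        . '<div class="ClearBoth Box">gute Verarbeitung</div>';

// str_get_html() builds the DOM object from a string.
$html = str_get_html($source);

// Match divs whose class attribute is exactly "ClearBoth Box".
$divs = $html->find('div[class=ClearBoth Box]');

foreach ($divs as $div) {
    // innertext is the markup inside the element, plaintext is the text only.
    echo $div->tag, ': ', $div->plaintext, PHP_EOL;
}
```

Printing `$div->innertext` or `$div->plaintext` instead of calling print_r on the node also avoids dumping the parent and sibling objects that every node references.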
Finding elements by tag name
// Find all anchors, returns an array of element objects
$ret = $html->find('a');
// Find all anchors and images, returns an array of element objects
$ret = $html->find('a, img');
// Find (N)th anchor, returns element object or null if not found (zero based)
$ret = $html->find('a', 0);
// Find last anchor, returns element object or null if not found (zero based)
$ret = $html->find('a', -1);
Finding elements by class name or id
// Find all elements with id=foo
$ret = $html->find('#foo');
// Find all elements with class=foo
$ret = $html->find('.foo');
Finding elements by attribute
// Find all <div> with the id attribute
$ret = $html->find('div[id]');
// Find all <div> whose id attribute is foo
$ret = $html->find('div[id=foo]');
// Find all anchors and images with the "title" attribute
$ret = $html->find('a[title], img[title]');
// Find all elements that have an id attribute
$ret = $html->find('*[id]');
Attribute filters
Supports these operators in attribute selectors:
Filter               Description
[attribute]          Matches elements that have the specified attribute.
[!attribute]         Matches elements that don't have the specified attribute.
[attribute=value]    Matches elements that have the specified attribute with a certain value.
[attribute!=value]   Matches elements that don't have the specified attribute with a certain value.
[attribute^=value]   Matches elements that have the specified attribute and it starts with a certain value.
[attribute$=value]   Matches elements that have the specified attribute and it ends with a certain value.
[attribute*=value]   Matches elements that have the specified attribute and it contains a certain value.
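As a rough illustration of the prefix, suffix, and substring operators listed above, here is a minimal sketch. It assumes `simple_html_dom.php` is available next to the script; the links are made up for the example.

```php
<?php
// Assumption: simple_html_dom.php from the Simple HTML DOM library is in the same directory.
require_once 'simple_html_dom.php';

// Hypothetical links, for illustration only.
$html = str_get_html(
    '<a href="http://example.com/manual.pdf">Manual</a>'
    . '<a href="docs/intro.html">Intro</a>'
    . '<a href="docs/api.pdf">API</a>'
);

// [attribute^=value]: href starts with "http"
$external = $html->find('a[href^=http]');
// [attribute$=value]: href ends with "pdf"
$pdfs = $html->find('a[href$=pdf]');
// [attribute*=value]: href contains "docs"
$docs = $html->find('a[href*=docs]');

foreach ($pdfs as $a) {
    echo $a->href, PHP_EOL;
}
```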
Finding descendants
// Find all <li> in <ul>
$es = $html->find('ul li');
// Find nested <div> tags
$es = $html->find('div div div');
// Find all <td> in <table> with class=hello
$es = $html->find('table.hello td');
// Find all td tags with attribute align=center in table tags
$es = $html->find('table td[align=center]');
Finding nested elements
// Find all <li> in <ul>
foreach ($html->find('ul') as $ul)
{
    foreach ($ul->find('li') as $li)
    {
        // do something...
    }
}
// Find first <li> in first <ul>
$e = $html->find('ul', 0)->find('li', 0);
Finding text blocks and comments
// Find all text blocks
$es = $html->find('text');
// Find all comment (<!--...-->) blocks
$es = $html->find('comment');