PHP Classes

Simple HTML DOM: Manipulate HTML elements using DOMDocument

Version: voku-simple_html_dom 2.0.78
License: MIT/X Consortium ...
PHP version: 5.3
Categories: HTML, PHP 5, Parsers

Authors

dimabdc
Lars Moelleken


Contributor

simple_html_dom - github.com

Description

This class can manipulate HTML elements using DOMDocument.

This is a fork of the SimpleHTMLDOM package that uses DOMDocument classes instead of HTML string manipulation.

It can parse and tolerate invalid HTML, and it supports UTF-8 documents.

It can search for tags on an HTML page with jQuery-style selectors.
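
To make "tolerate invalid HTML" concrete, here is a minimal sketch that uses only PHP's bundled DOM extension, not this package's own API. It assumes nothing about the package beyond the fact that it builds on DOMDocument; the sample markup and variable names are made up for illustration.

```php
<?php
// Sketch only: plain DOMDocument, not this package's API.
// The markup is deliberately invalid: neither <p> element is closed.
$broken = '<p>one<p>two';

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($broken);
libxml_clear_errors();

// The parser closes the first <p> implicitly, so both paragraphs survive.
$texts = [];
foreach ($doc->getElementsByTagName('p') as $p) {
    $texts[] = $p->textContent;
}

echo implode(' | ', $texts), PHP_EOL;
```

A wrapper built on DOMDocument can expose the recovered tree through a friendlier interface, which is presumably where the package's selector methods come in.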

Example

<?php

use voku\helper\HtmlDomParser;

require_once '../vendor/autoload.php';

// -----------------------------------------------------------------------------
// descendant selector
$str = <<<HTML
<div>
    <div>
        <div class="foo bar">ok</div>
    </div>
</div>
HTML;

$html = HtmlDomParser::str_get_html($str);
echo $html->find('div div div', 0)->innertext . '<br>'; // result: "ok"

// -----------------------------------------------------------------------------
// nested selector
$str = <<<HTML
<ul id="ul1">
    <li>item:<span>1</span></li>
    <li>item:<span>2</span></li>
</ul>
<ul id="ul2">
    <li>item:<span>3</span></li>
    <li>item:<span>4</span></li>
</ul>
HTML;

$html = HtmlDomParser::str_get_html($str);
foreach ($html->find('ul') as $ul) {
    foreach ($ul->find('li') as $li) {
        echo $li->innertext . '<br>';
    }
}

// -----------------------------------------------------------------------------
// parsing checkbox
$str = <<<HTML
<form name="form1" method="post" action="">
    <input type="checkbox" name="checkbox1" value="checkbox1" checked>item1<br>
    <input type="checkbox" name="checkbox2" value="checkbox2">item2<br>
    <input type="checkbox" name="checkbox3" value="checkbox3" checked>item3<br>
</form>
HTML;

$html = HtmlDomParser::str_get_html($str);
foreach ($html->find('input[type=checkbox]') as $checkbox) {
    if ($checkbox->checked) {
        echo $checkbox->name . ' is checked<br>';
    } else {
        echo $checkbox->name . ' is not checked<br>';
    }
}


Details

Simple Html Dom Parser for PHP

An HTML DOM parser written in PHP that lets you manipulate HTML in a very easy way! This is a fork of the PHP Simple HTML DOM Parser project, but instead of string manipulation it uses DOMDocument and modern PHP classes such as "Symfony CssSelector".

  • PHP 7.0+ & 8.0 Support
  • PHP-FIG Standard
  • Composer & PSR-4 support
  • PHPUnit testing via Travis CI
  • PHP-Quality testing via SensioLabsInsight
  • UTF-8 Support (more support via "voku/portable-utf8")
  • Invalid HTML Support (partly ...)
  • Find tags on an HTML page with selectors just like jQuery
  • Extract contents from HTML in a single line
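
On top of DOMDocument, a CSS selector lookup is typically answered by an XPath query, which is the kind of translation Symfony CssSelector performs. The sketch below is not the package's code: it uses only PHP's bundled DOMDocument and DOMXPath classes, and the XPath expression is translated by hand from the CSS selector "div div div" purely for illustration.

```php
<?php
// Sketch only: plain DOMDocument and DOMXPath, not this package's API.
$html = '<div><div><div class="foo bar">ok</div></div></div>';

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Hand-written XPath counterpart of the CSS descendant selector "div div div".
$nodes = $xpath->query('//div//div//div');

// item(0) returns null when nothing matched, so guard before reading the text.
$first = $nodes->item(0);
$text = $first !== null ? $first->textContent : '';

echo $text, PHP_EOL;
```

The package wraps this kind of lookup behind methods such as find() and findOne(), so callers write the CSS selector and never touch XPath directly.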

Install via "composer require"

composer require voku/simple_html_dom
composer require voku/portable-utf8 # if you need e.g. UTF-8 fixed output

Quick Start

use voku\helper\HtmlDomParser;

require_once 'composer/autoload.php';

...
$dom = HtmlDomParser::str_get_html($str);
// or 
$dom = HtmlDomParser::file_get_html($file);

$element = $dom->findOne('#css-selector'); // "$element" === instance of "SimpleHtmlDomInterface"

$elements = $dom->findMulti('.css-selector'); // "$elements" === instance of SimpleHtmlDomNodeInterface<int, SimpleHtmlDomInterface>

$elementOrFalse = $dom->findOneOrFalse('#css-selector'); // "$elementOrFalse" === instance of "SimpleHtmlDomInterface" or false

$elementsOrFalse = $dom->findMultiOrFalse('.css-selector'); // "$elementsOrFalse" === instance of SimpleHtmlDomNodeInterface<int, SimpleHtmlDomInterface> or false
...

Examples

github.com/voku/simple_html_dom/tree/master/example

API

github.com/voku/simple_html_dom/tree/master/README_API.md

Support

For support and donations please visit Github | Issues | PayPal | Patreon.

For status updates and release announcements please visit Releases | Twitter | Patreon.

For professional support please contact me.

Thanks

  • Thanks to GitHub (Microsoft) for hosting the code and a good infrastructure including issue management, etc.
  • Thanks to IntelliJ as they make the best IDEs for PHP and they gave me an open source license for PhpStorm!
  • Thanks to Travis CI for being the most awesome, easiest continuous integration tool out there!
  • Thanks to StyleCI for the simple but powerful code style check.
  • Thanks to PHPStan && Psalm for really great static analysis tools and for discovering bugs in the code!


Files (89)
File | Role | Description
.github/ (3 files, 1 directory)
build/ (2 files, 1 directory)
example/ (16 files)
src/ (1 directory)
tests/ (12 files, 1 directory)
.editorconfig | Data | Auxiliary data
.scrutinizer.yml | Data | Auxiliary data
.styleci.yml | Data | Auxiliary data
.travis.yml | Data | Auxiliary data
CHANGELOG | Data | Auxiliary data
composer.json | Data | Auxiliary data
LICENSE | Lic. | License text
phpcs.php_cs | Example | Example script
phpstan.neon | Data | Auxiliary data
phpunit.xml | Data | Auxiliary data
README.md | Doc. | Documentation
README_API.md | Doc. | Documentation

Files (89) / .github
File | Role | Description
workflows/ (1 file)
CONTRIBUTING.md | Data | Auxiliary data
FUNDING.yml | Data | Auxiliary data
ISSUE_TEMPLATE.md | Data | Auxiliary data

Files (89) / .github / workflows
File | Role | Description
ci.yml | Data | Auxiliary data

Files (89) / build
File | Role | Description
docs/ (1 file)
composer.json | Data | Auxiliary data
generate_docs.php | Class | Class source

Files (89) / build / docs
File | Role | Description
api.md | Data | Auxiliary data

Files (89) / example
File | Role | Description
example_add_content.php | Example | Example script
example_advanced_selector.php | Example | Example script
example_basic_selector.php | Example | Example script
example_extract_data_attribute.php | Example | Example script
example_extract_html.php | Example | Example script
example_extract_meta_tags.php | Example | Example script
example_find_image_if_exists.php | Example | Example script
example_find_text.php | Example | Example script
example_modify_attribute.php | Example | Example script
example_modify_contents.php | Example | Example script
example_modify_styles_with_svg.php | Example | Example script
example_remove_comments.php | Example | Example script
example_remove_content.php | Example | Example script
example_remove_content_from_table.php | Example | Example script
example_scraping_imdb.php | Example | Example script
example_scraping_lebensmittelwarnung.php | Example | Example script

Files (89) / src
File | Role | Description
voku/ (1 directory)

Files (89) / src / voku
File | Role | Description
helper/ (24 files)

Files (89) / src / voku / helper
File | Role | Description
AbstractDomParser.php | Class | Class source
AbstractSimpleHtmlDom.php | Class | Class source
AbstractSimpleHtmlDomNode.php | Class | Class source
AbstractSimpleXmlDom.php | Class | Class source
AbstractSimpleXmlDomNode.php | Class | Class source
DomParserInterface.php | Class | Class source
HtmlDomHelper.php | Class | Class source
HtmlDomParser.php | Class | Class source
SelectorConverter.php | Class | Class source
SimpleHtmlAttributes.php | Class | Class source
SimpleHtmlAttributesInterface.php | Class | Class source
SimpleHtmlDom.php | Class | Class source
SimpleHtmlDomBlank.php | Class | Class source
SimpleHtmlDomInterface.php | Class | Class source
SimpleHtmlDomNode.php | Class | Class source
SimpleHtmlDomNodeBlank.php | Class | Class source
SimpleHtmlDomNodeInterface.php | Class | Class source
SimpleXmlDom.php | Class | Class source
SimpleXmlDomBlank.php | Class | Class source
SimpleXmlDomInterface.php | Class | Class source
SimpleXmlDomNode.php | Class | Class source
SimpleXmlDomNodeBlank.php | Class | Class source
SimpleXmlDomNodeInterface.php | Class | Class source
XmlDomParser.php | Class | Class source

Files (89) / tests
File | Role | Description
fixtures/ (18 files)
AuxiliarFunctionsTest.php | Class | Class source
bootstrap.php | Aux. | Auxiliary script
CommentTest.php | Class | Class source
DomManipulationTest.php | Class | Class source
HTML5DOMDocumentTest.php | Class | Class source
HtmlDomParserTest.php | Class | Class source
SimpleHtmlDomMemoryTest.php | Class | Class source
SimpleHtmlDomNodeTest.php | Class | Class source
SimpleHtmlDomTest.php | Class | Class source
SimpleHtmlHelperTest.php | Class | Class source
TwigTest.php | Class | Class source
XmlDomParserTest.php | Class | Class source

Files (89) / tests / fixtures
File | Role | Description
big.html | Doc. | Documentation
horrible.html | Doc. | Documentation
issue81.html | Doc. | Documentation
issue81_v2.html | Doc. | Documentation
small.html | Doc. | Documentation
test_mail.html | Doc. | Documentation
test_mail_expected.html | Doc. | Documentation
test_page.html | Doc. | Documentation
test_page_plaintext.html | Doc. | Documentation
test_template.twig | Data | Auxiliary data
test_template_js.html | Doc. | Documentation
test_xml.xml | Data | Auxiliary data
test_xml_complex.xml | Data | Auxiliary data
test_xml_complex_v2.xml | Data | Auxiliary data
test_xml_complex_v3.xml | Example | Example script
test_xml_expected.xml | Data | Auxiliary data
test_xml_replace_expected.xml | Data | Auxiliary data
windows-1252-example.html | Doc. | Documentation

The PHP Classes site has supported package installation using the Composer tool since 2013, as described on its instructions page.
Download: voku-simple_html_dom-2023-02-12.zip (286KB)
Download: voku-simple_html_dom-2023-02-12.tar.gz (249KB)
Needed packages
Class | Download | Why it is needed | Dependency
Portable UTF-8 | .zip .tar.gz | Strin | Required