DOMDocument::loadHTMLFile

Load HTML from a file

Description

public DOMDocument::loadHTMLFile(string $filename, int $options = 0): bool

Parses the HTML document stored in the file named by filename. Unlike XML, HTML does not have to be well-formed to load.

Warning

This function parses the input using an HTML 4 parser. The parsing rules of HTML 5, which is what modern web browsers use, are different. Depending on the input this might result in a different DOM structure. Therefore this function cannot be safely used for sanitizing HTML.

The behavior when parsing HTML can depend on the version of libxml in use, particularly with regard to edge cases and error handling. For parsing that conforms to the HTML5 specification, use Dom\HTMLDocument::createFromString or Dom\HTMLDocument::createFromFile, added in PHP 8.4.

As an example, some HTML elements implicitly close a parent element when encountered. The rules for automatically closing parent elements differ between HTML 4 and HTML 5, so the DOM structure that DOMDocument sees might differ from the one a web browser sees. An attacker may be able to exploit that difference to break the resulting HTML.
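
The following is a minimal sketch of one way to act on this warning: prefer the HTML5 parser when the running PHP version provides it, and fall back to DOMDocument otherwise. The helper name loadHtmlDocument and the temporary demo file are hypothetical and only serve the illustration; the fallback branch still uses the HTML 4 parser and is therefore still unsuitable for sanitizing.

```php
<?php
// Hypothetical helper: use the HTML5 parser (PHP 8.4+) when available,
// otherwise fall back to the legacy libxml HTML 4 parser.
function loadHtmlDocument(string $path): object
{
    if (class_exists('Dom\HTMLDocument')) {
        // LIBXML_NOERROR keeps parse errors from being reported as warnings.
        return \Dom\HTMLDocument::createFromFile($path, LIBXML_NOERROR);
    }

    $doc = new DOMDocument();
    // Collect libxml errors internally instead of emitting E_WARNING.
    $previous = libxml_use_internal_errors(true);
    $loaded = $doc->loadHTMLFile($path);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($loaded === false) {
        throw new RuntimeException("Could not load HTML from $path");
    }
    return $doc;
}

// Demo input written to a temporary file (an assumption for this sketch).
$path = tempnam(sys_get_temp_dir(), 'html');
file_put_contents(
    $path,
    '<!DOCTYPE html><html><head><title>Demo</title></head><body><p>Hello</p></body></html>'
);

$doc = loadHtmlDocument($path);
$paragraphs = $doc->getElementsByTagName('p')->length;
unlink($path);

echo get_class($doc), " found $paragraphs paragraph(s)\n";
```

The two classes are not interchangeable in every respect, so code that consumes the result should rely only on methods both provide, such as getElementsByTagName.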

Parameters

filename

The path to the HTML file.

options

Bitwise OR of the libxml option constants.
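
As a hedged illustration of combining options, the sketch below passes three libxml constants joined with the bitwise OR operator. The temporary file and its contents are assumptions made for the demo; which options are appropriate depends on the application.

```php
<?php
// Demo input: a fragment with no html or body tags (an assumption for this sketch).
$path = tempnam(sys_get_temp_dir(), 'html');
file_put_contents($path, '<p>Fragment without html or body tags</p>');

// LIBXML_NOERROR and LIBXML_NOWARNING suppress error and warning reports;
// LIBXML_NONET disables network access while loading the document.
$options = LIBXML_NOERROR | LIBXML_NOWARNING | LIBXML_NONET;

$doc = new DOMDocument();
$loaded = $doc->loadHTMLFile($path, $options);
unlink($path);

if ($loaded) {
    // The parser supplies the missing html and body elements itself.
    $text = $doc->getElementsByTagName('p')->item(0)->textContent;
    echo $text, "\n";
}
```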

Return Values

Returns true on success or false on failure.

Errors/Exceptions

If an empty string is passed as the filename, or the named file is empty, a warning will be generated. This warning is not generated by libxml and cannot be handled using libxml's error handling functions.

While malformed HTML should load successfully, this function may generate E_WARNING errors when it encounters bad markup. libxml's error handling functions may be used to handle these errors.
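
A minimal sketch of handling these errors with libxml's error functions, assuming a deliberately malformed demo file written to a temporary path. Which errors are reported, if any, depends on the libxml version, so the sketch prints whatever was collected without assuming a particular count.

```php
<?php
// Demo input with bad markup: an unclosed <p> and a stray </span>.
$path = tempnam(sys_get_temp_dir(), 'html');
file_put_contents($path, '<html><body><p>Unclosed paragraph<div></span></body></html>');

// Collect libxml errors internally instead of emitting E_WARNING.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$loaded = $doc->loadHTMLFile($path);

// Copy the collected errors, then clear the buffer and restore the old setting.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);
unlink($path);

foreach ($errors as $error) {
    // Each entry is a LibXMLError object with level, code, line, and message.
    printf("line %d: %s\n", $error->line, trim($error->message));
}
```

Restoring the previous libxml_use_internal_errors setting keeps the change from leaking into unrelated code that runs later in the same request.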

Changelog

Version  Description
8.3.0    This function now has a tentative bool return type.
8.0.0    Calling this function statically will now throw an Error. Previously, an E_DEPRECATED was raised.

Examples

Example #1 Creating a Document

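<?php
$doc = new DOMDocument();
// loadHTMLFile() returns false on failure, so check before using the document.
if (!$doc->loadHTMLFile("filename.html")) {
    exit("Could not load filename.html\n");
}
// Serialize the parsed document back to HTML.
echo $doc->saveHTML();
?>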

See Also

  • DOMDocument::loadHTML
  • DOMDocument::saveHTML
  • DOMDocument::saveHTMLFile