parse_url

Parse a URL and return its components

Description

intstringarraynullfalse parse_url(string $url, int $component = -1)

This function parses a URL and returns an associative array containing any of the various components of the URL that are present. The values of the array elements are not URL decoded.

This function is not meant to validate the given URL, it only breaks it up into the parts listed below. Partial and invalid URLs are also accepted, parse_url tries its best to parse them correctly.

Caution

This function does not follow any established URI or URL standard. It will return incorrect or non-sense results for relative or malformed URLs. Even for valid URLs the result may differ from that of a different URL parser, since there are multiple different URL-related standards that target different use cases and that differ in their requirements.

Processing an URL with parsers following different URL standards is a common source of security vulnerabilities. As an example, validating an URL against an allow-list of acceptable hostnames with parser A might be ineffective when the actual retrieval of the resource uses parser B that extracts hostnames differently.

The Uri\Rfc3986\Uri and Uri\WhatWg\Url classes strictly follow the RFC 3986 and WHATWG URL Standards respectively. It is strongly recommended to use these classes for all newly written code and to migrate existing uses of the parse_url function to these classes, unless the parse_url behavior needs to be preserved for compatibility reasons.

Parameters

url

The URL to parse.

component

Specify one of PHP_URL_SCHEME, PHP_URL_HOST, PHP_URL_PORT, PHP_URL_USER, PHP_URL_PASS, PHP_URL_PATH, PHP_URL_QUERY or PHP_URL_FRAGMENT to retrieve just a specific URL component as a string (except when PHP_URL_PORT is given, in which case the return value will be an int).

Return Values

On seriously malformed URLs, parse_url may return false.

If the component parameter is omitted, an associative array is returned. At least one element will be present within the array. Potential keys within this array are:

  • scheme - e.g. http
  • host
  • port
  • user
  • pass
  • path
  • query - after the question mark ?
  • fragment - after the hashmark #

If the component parameter is specified, parse_url returns a string (or an int, in the case of PHP_URL_PORT) instead of an array. If the requested component doesn't exist within the given URL, null will be returned. As of PHP 8.0.0, parse_url distinguishes absent and empty queries and fragments:

http://example.com/foo → query = null, fragment = null
http://example.com/foo? → query = "",   fragment = null
http://example.com/foo# → query = null, fragment = ""
http://example.com/foo?# → query = "",   fragment = ""

Previously all cases resulted in query and fragment being null.

Note that control characters (cf. ctype_cntrl) in the components are replaced with underscores (_).

Changelog

Version Description
8.0.0 parse_url will now distinguish absent and empty queries and fragments.

Examples

Example #1 A parse_url example

<?php
$url = 'http://username:password@hostname:9090/path?arg=value#anchor';

var_dump(parse_url($url));
var_dump(parse_url($url, PHP_URL_SCHEME));
var_dump(parse_url($url, PHP_URL_USER));
var_dump(parse_url($url, PHP_URL_PASS));
var_dump(parse_url($url, PHP_URL_HOST));
var_dump(parse_url($url, PHP_URL_PORT));
var_dump(parse_url($url, PHP_URL_PATH));
var_dump(parse_url($url, PHP_URL_QUERY));
var_dump(parse_url($url, PHP_URL_FRAGMENT));
?>

The above example will output:

array(8) {
  ["scheme"]=>
  string(4) "http"
  ["host"]=>
  string(8) "hostname"
  ["port"]=>
  int(9090)
  ["user"]=>
  string(8) "username"
  ["pass"]=>
  string(8) "password"
  ["path"]=>
  string(5) "/path"
  ["query"]=>
  string(9) "arg=value"
  ["fragment"]=>
  string(6) "anchor"
}
string(4) "http"
string(8) "username"
string(8) "password"
string(8) "hostname"
int(9090)
string(5) "/path"
string(9) "arg=value"
string(6) "anchor"

Example #2 A parse_url example with missing scheme

<?php
$url = '//www.example.com/path?googleguy=googley';

// Prior to 5.4.7 this would show the path as "//www.example.com/path"
var_dump(parse_url($url));
?>

The above example will output:

array(3) {
  ["host"]=>
  string(15) "www.example.com"
  ["path"]=>
  string(5) "/path"
  ["query"]=>
  string(17) "googleguy=googley"
}

Notes

Note:

This function is intended specifically for the purpose of parsing URLs and not URIs. However, to comply with PHP's backwards compatibility requirements it makes an exception for the file:// scheme where triple slashes (file:///...) are allowed. For any other scheme this is invalid.

See Also

  • pathinfo
  • parse_str
  • http_build_query
  • dirname
  • basename
  • » RFC 3986