Find All Links on a Page

Here's the basic principle behind spiders: fetch a page, parse it, and collect every link on it.

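$html = file_get_contents(''); // pass the URL of the page to scan

$dom = new DOMDocument();
// load the fetched markup; @ silences warnings from malformed HTML
@$dom->loadHTML($html);

// grab all the links on the page
$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate("/html/body//a");

// print the href of each link
for ($i = 0; $i < $hrefs->length; $i++) {
       $href = $hrefs->item($i);
       $url = $href->getAttribute('href');
       echo $url.'<br />';
}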

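To try the idea without fetching a live page, here is a minimal sketch of the snippet above that works on an HTML string instead. The function name `extractLinks` and the sample markup are made up for illustration, and it uses `libxml_use_internal_errors()` rather than `@` to keep parser warnings quiet, which is a swapped-in choice and not part of the original snippet.

```php
<?php
// Hypothetical helper (not from the original post): returns the href of
// every <a> element found in an HTML string.
function extractLinks(string $html): array
{
    // Collect parser warnings internally instead of printing them;
    // real-world markup is rarely valid.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // Discard the collected warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $links = [];
    // //a[@href] matches every anchor that actually has an href attribute.
    foreach ($xpath->query('//a[@href]') as $anchor) {
        $links[] = $anchor->getAttribute('href');
    }
    return $links;
}

// Sample markup, made up for illustration.
$sample = '<html><body><p><a href="/about">About</a> '
        . '<a name="top">no href</a> '
        . '<a href="https://example.com/">Example</a></p></body></html>';

foreach (extractLinks($sample) as $url) {
    echo $url, PHP_EOL;
}
```

Save it as a `.php` file and run it from the command line with `php`, or drop it on a PHP-enabled server. Returning an array rather than echoing inside the loop makes it easier to filter the links or feed them back in to crawl further pages.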

  1. Oleg

    Exactly what I needed. Thanks.

  2. Jens Törnell

    Perfect for affiliate sites!

  3. RONIT


  4. daniel

    I didn't quite understand how to use this. Where do I type that? I'm kind of confused. I need more explanation.


  5. lande

    This post is just too good. thumbs up!!
    keep up the good work ;)

  6. Dario

    Thank you very much for the help!

  7. Zbigniew

    Works perfect! thx!

  8. juan

    Can someone please show me, step by step, how to use this? Thank you in advance.

  9. kazi tanvir ahsan

    Perfect. I was using PHP Simple DOM, but it's not as good as this!

  10. shail.dw

    The unique power of PHP and DOM unleashed. cURL and regex-based techniques can never match this, though they have their own uses, of course. Many thanks.

  11. Milan

    How do I follow all the other child pages?

  12. Zen

    What about the performance of XPath?

  13. obliviga

    This is amazing. Thank you so much.

  14. Lorenzo

    Thanks, very simple. Great!

  15. Sif Eddine

    Hi, thanks, it's very helpful, but I have a question:
    what if I need to get a link with a specific class?
    Will this do it: (html/body//a.class)?

  16. alexander

    Is there a cURL version of this?
    I'd appreciate it if anyone could write it with cURL.
