URL Meta

FAQs For Websites

Who made it?

I made it. My name is Moin and I'm from Islamabad, Pakistan. I work as a freelance full stack consultant & developer. Here's my website in case you were wondering.

I made this because most of my previous jobs required building crawlers or indexers, so I thought: why not an API to serve this purpose? Here it is.

What does it extract?

URLMeta.org only extracts data from the head section of a page: Title, Description, Image, Logo and Feed. For feeds, it looks for RSS, Atom and oEmbed. Each of these is extracted and returned only if it is present in the page.

It looks for the Open Graph protocol first, then for Twitter Cards, then for Schema image and description values found in itemprop attributes.

If none of these protocols is found, it falls back to the title tag and the description meta tag.
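The lookup order described above could be sketched roughly as follows. This is not URLMeta's actual code, just a minimal illustration using Python's standard library. The names `HeadParser`, `extract`, `first_present` and `FEED_TYPES` are made up for this example, as is the exact list of feed MIME types, and Logo is left out because the page does not say how it is detected.

```python
from html.parser import HTMLParser

# <link type="..."> values commonly used to advertise feeds (assumed list).
FEED_TYPES = {
    "application/rss+xml": "rss",
    "application/atom+xml": "atom",
    "application/json+oembed": "oembed",
    "text/xml+oembed": "oembed",
}


class HeadParser(HTMLParser):
    """Collects <title>, <meta> values and feed <link> tags from a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}   # e.g. "og:title" -> content
        self.feeds = {}  # e.g. "rss" -> href
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        a = {k: (v or "") for k, v in attrs}
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and a.get("content"):
            for key_attr in ("property", "name", "itemprop"):
                if a.get(key_attr):
                    # Prefix itemprop keys so they never clash with name="description".
                    prefix = "itemprop:" if key_attr == "itemprop" else ""
                    self.meta.setdefault(prefix + a[key_attr].lower(), a["content"])
        elif tag == "link" and a.get("href"):
            kind = FEED_TYPES.get(a.get("type", "").lower())
            if kind and "alternate" in a.get("rel", "").lower().split():
                self.feeds.setdefault(kind, a["href"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def first_present(meta, keys):
    """Return the value of the first key that has a non-empty value."""
    for key in keys:
        if meta.get(key):
            return meta[key]
    return None


def extract(html):
    p = HeadParser()
    p.feed(html)
    m = p.meta
    return {
        # Open Graph, then Twitter Card, then the plain <title> tag.
        "title": first_present(m, ["og:title", "twitter:title"]) or p.title.strip() or None,
        # Open Graph, Twitter Card, Schema itemprop, then the description meta tag.
        "description": first_present(
            m, ["og:description", "twitter:description", "itemprop:description", "description"]
        ),
        "image": first_present(m, ["og:image", "twitter:image", "itemprop:image"]),
        "feeds": p.feeds,
    }
```

Calling `extract(page_source)` returns a dictionary in which any field missing from the page is `None` (or an empty dictionary for feeds), which mirrors the "only if present" behaviour described above.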

Can it crawl AJAX-based websites too?

No, not yet, but I'm working on it. It'll use the same method as Google does for AJAX crawling. Google's method and guidelines can be found here, here and here.

Is this an open-source project?

Yes, but currently only this site is available on GitHub. The API is not open yet. I'm making some changes, refactoring code, adding a cache, etc. It will be available on GitHub here once it's 'presentable'.

Can I restrict it from crawling my site?

Yes, you can. Simply add <meta name="urlmeta" content="no"> in the head section of your website/page.

If you want a more general way, have your server send a custom response header named urlmeta with no as its value.

Here are some examples on how to set it up on your server:

Apache .htaccess
Header add urlmeta "no"

NGINX nginx.conf
add_header urlmeta no;
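To check that either opt-out is actually in place, a small script can apply the same two tests to a response you have already fetched. The sketch below is only an illustration: `is_opted_out` and `OptOutParser` are hypothetical names, and the case-insensitive matching is an assumption, since this page does not say exactly how URLMeta compares the values.

```python
from html.parser import HTMLParser


class OptOutParser(HTMLParser):
    """Looks for <meta name="urlmeta" content="no"> in a page."""

    def __init__(self):
        super().__init__()
        self.opted_out = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        a = {k: (v or "") for k, v in attrs}
        if a.get("name", "").lower() == "urlmeta" and a.get("content", "").strip().lower() == "no":
            self.opted_out = True


def is_opted_out(headers, html):
    """Return True if the response headers or the page markup opt out."""
    # Header names are case-insensitive in HTTP, so compare in lowercase.
    for name, value in headers.items():
        if name.lower() == "urlmeta" and value.strip().lower() == "no":
            return True
    parser = OptOutParser()
    parser.feed(html)
    return parser.opted_out
```

Here `headers` is a plain dictionary of response headers and `html` is the page source, so the function works with whichever HTTP client you use to fetch the page.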

I have something else to ask. How do I contact you?

You can fill out this form, comment below or just tweet me, and I'll get back to you.

Discuss: