Google Webmaster Central Blog - Official news on crawling and indexing sites for the Google index

Flash indexing with external resource loading

Thursday, June 18, 2009 at 11:27 PM

Webmaster Level: All

We just added external resource loading to our Flash indexing capabilities. This means that when a SWF file loads content from some other file—whether it's text, HTML, XML, another SWF, etc.—we can index this external content too, and associate it with the parent SWF file and any documents that embed it.
This new capability improves search quality by allowing relevant content contained in external resources to appear in response to users' queries. For example, this result currently comes up in response to the query [2002 VW Transporter 888]:


Prior to this launch, this result did not appear, because all of the relevant content is contained in an XML file loaded by a SWF file.

To date, when Google encounters SWF files on the web, we can:
  • Index textual content displayed as a user interacts with the file. We click buttons and enter input, just like a user would.
  • Discover links within Flash files.
  • Load external resources and associate the content with the parent file.
  • Support common JavaScript techniques for embedding Flash, such as SWFObject and SWFObject2.
  • Index sites scripted with AS1 and AS2, even if the ActionScript is obfuscated. Update on June 19, 2009: We index sites with AS3 as well. The ActionScript version isn't particularly relevant in our indexing process, so we support older versions of AS in addition to the latest.
If you don't want your SWF file or any of its external resources crawled by search engines, please use an appropriate robots.txt directive.
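
As a rough illustration of the robots.txt approach, the sketch below uses Python's standard urllib.robotparser module to check whether a set of rules would block a SWF file and the data it loads. The rules, paths, and the example.com host are hypothetical placeholders, not taken from any real site, and Python's parser only does simple prefix matching, so it approximates how a crawler reads the file rather than reproducing Googlebot's behavior.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block one SWF file and the directory holding
# the external resources it loads. All paths are placeholders.
ROBOTS_TXT = """\
User-agent: *
Disallow: /flash/site.swf
Disallow: /flash/data/
"""


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/flash/site.swf", "/flash/data/cars.xml", "/index.html"):
        url = "https://www.example.com" + path
        print(path, is_crawlable(ROBOTS_TXT, "Googlebot", url))
```

Running a check like this before deploying a robots.txt change can help confirm that the SWF and its external resources are blocked while the embedding HTML pages stay crawlable.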


40 comments:

EzyBlogger said...

Great !

Muhammad Yaqoob said...

Thanks for sharing this valuable piece of information.

Gautam said...

That's great news!!

noisycity said...

Hi and great!

Could you send us an example of robots.txt text to block links?

Stephane.

kevin said...

I notice the clickable links in Flash videos such as YouTube embeds don't get crawled. Will this ever be changed? It seems like if someone feels a video is important enough to post, the link to the source of that video would be completely relevant and should count as a link.

Tomek said...

How does it work with dynamically loaded XML assets? For example, Flash loads a config.xml file (whose URL is built into the SWF movie), then, depending on user interaction, it dynamically loads other XML assets whose URL paths are stored in the config.xml file.

Would they be indexed as well?

cricket said...

That's good news. I was not using Flash for this reason on my cricket website http://www.thecricfanclub.com
But I can use it now :)

DataPlus - Custom Data Services said...

I think this is huge for flash.

Dan Mark said...

Great news!
But what about ActionScript 3?

Is it not correctly interpreted and indexed yet?

DaLLasISP said...

Hmmm, will the indexing result in the external file being the search result or the swf that is using the external file?

Good question from Tomek about indexing db connection xml or other undisclosed information.

Cheers, 1CallService.com

Paulo Moreira said...

"Index sites scripted with AS1 and AS2" - So what about AS3?

stevemassart said...

Does this work with all versions of Flash or just more recent ones?

raminoacid said...

Not much use if AS3 is not supported!

E.S.V. said...

An example of a robots.txt file blocking some .swf and Flash related assets would be as follows:

User-Agent: *
Disallow: /website.swf
Disallow: /data/myData.xml*
Allow: /

In fact, I had to use a similar one a couple of months ago when my XML files gained more PageRank than my very own Flash site.

On the other hand, I didn't like users accessing the SWF files directly. My SWF files need the embedding parameters of the SWFObject in order to work properly. So nobody should get straight to the .swf, or she would think that it's a broken site.

That's why filtering swfs and external data files in robots.txt becomes so important.

Ji said...

Finally! Thks for the Feedback.

Bridging Loans said...

Thanks for the example E.S.V - this is good news! My flash site http://benrandall.co.uk/bridging.aspx might actually start appearing in search results now!!

zedia.net said...

Google is already indexing AS3 content using the Ichabod player. I think the article was implying that it is also indexing AS1 and AS2 content.

This is great news, as all the websites I build use an external preloader to load my main content. Now that content can be indexed.

Kevin said...

In general this is a positive step for Adobe and Google, but it's still a long way from having a Flash RIA be as penetrable by Google's crawler as HTML, CSS and JavaScript are. We at Socrata previously built our social data discovery application entirely in Flex/Flash, but in part due to the ongoing SEO challenges we rebuilt the site in HTML, CSS and JavaScript. Socrata is a site for people to find public data and datasets, so it was important that the underlying content be indexed by Google and other search engines. Since relaunching the site the SEO improvement has been substantial and was almost immediate.

Jeremy said...

Does it index deep content? Let's say that a user clicks on a Flash link/button which in turn loads some external XML content. Does it get indexed?

Also, what happens if my external XML content contains a lot of content that is not meant for user display? Would it get indexed?

Maile Ohye said...

@Tomek: Yes, the XML assets in your example could be indexed as well.

Vuzum said...

This is great! Thank you! :D

Eden said...

Great news. Now the matter is to hide unwanted contents from google...

Todd Dominey said...

I'm interested to know more about how it indexes URLs found in an XML file. For example, if an XML file contained URLs for external images, would the paths have to be absolute? Or does the indexing have the capability to follow relative paths using the parent document as the root?

Logan said...

Yes, the robots are crawling AS3 now; I can verify that. Here's what I've experienced with Google indexing our Flash stuff:

1. Don't break apart your text, but it doesn't seem to matter whether it's HTML text format or not.

2. Keep all your "accessible" text in movie clips, NOT graphic clips.

3. Google seems to have reduced some of the significance of Flash text on home pages. Previously Google was returning search results for the content of our SWF file, even though that content was minimal on the page and not directly related to the rest of the site content (client testimonials). Looks like that's been fixed.

4. I would ALWAYS make external links within a Flash file absolute, including the http://, unless you're pulling data back, like an mp3 file or image. Those I would block with robots.txt.

5. I'm sure they're filtering out the config.xml files, since that's included in every SWF. But I'll test that out to be sure.

6. Links in videos probably won't be crawled if they're in .flv format, like YouTube. However, if you have a .swf loading FLVs into it, any links in the SWF should be recognized. I'm assuming you can "nofollow" those as well.

7. I still have NO intention of building sites entirely in Flash, since you can go ahead and say goodbye to any mobile visitors, which happens to be the biggest growth area right now.

8. Still, this IS great.

Ammar Mardawi said...

You can prevent google from indexing sensitive strings in your AS code by encrypting the strings either manually, or by using secureSWF's literal strings encryption feature.

Disclosure: I work for Kindisoft.

Sherrlyn Borkgren Photography said...

Wow! I can't wait to see how it changes searches on my photography flash site. I'm so tired of being on page 3-15 on google search


http://www.BorkgrenPhoto.net

Chee Sheng said...

There is a site, g2000, that loads some content from XML data, but it seems that Google search doesn't return the respective pages. For example, searching for [g2000 ang mo kio hub] does not return the following result: http://www.g2000.com.sg/ss09/#/store/2/1...

Does that mean that there is a syntax / structure I would need to follow?

Webbpromoter said...

thanks for sharing

Tolomelli said...

What about AMF content?

Garrett said...

Will deep linking with SWFAddress be incorporated?

Andrew Blair said...

I knew it was coming but now I can tell my clients it's here, thanks!

paul said...

OK, so Google has indexed text in our Flash photography .swf, which is neat, BUT the Google link to said indexed content only brings up the .swf file rather than loading the HTML file the .swf is embedded in. Is it just me or is this a bad way to do it? Now the .swf file loads oversized in the browser, which means it scales up over 100%, making the JPEGs look awful. Is there something I can do to prevent this?

Janus Klok said...

Do you make webservice calls and index the returning XML as well?

Seanonymous said...

When you say "We click buttons and enter input, just like a user would," is that limited to Flash components? If I create my own drop down menu from scratch with AS and drawing objects, can your robots really access that?

And how do you decide what type of text gets entered? You guys haven't come up with some kind of AI that's eventually going to subjugate us, have you?

Lidali said...

Really cute and all but what if you have a complete site in flash and it can "read" all the external files. What good does that do when one flash-file contains 80 pages and the content is put in an xml-file.
Does it then link to the flash-page? Because that does really no good for searches, then I still don't know which of the 80 pages contains the information I am looking for.

Or am I seeing this all wrong? Anyone who knows?

dsaliberti said...

Great! Relevant! important! :)
Thanx!

Lets index it all.

Index flash as3 developers too ;)
dsaliberti

Wanja said...

Nice one chiefs!

Ralphie Dee said...

Let me ask this: this discussion is about indexing files within a SWF, and I'm looking for an answer to a similar question. If I embed my SWF content inside an XHTML file, then go to the header and add the title of the site and regular keywords, wouldn't Google see it?

Vasco said...

I'm also interested in AMF content. Is it indexed too?