PwnDizzle: How to Bypass Facebook's Text Captcha

Thursday, 10 July 2014

In this post I'll discuss Facebook's text captcha and how to bypass it with a little Gimp-Fu image cleaning and Tesseract OCR. The techniques below build on previous work where I demonstrated how to bypass Bugcrowd's captcha.

The Facebook Captcha(s)

I've seen Facebook use two captchas. The first is the friend photo captcha, where you are required to select your friends in pictures. This one seemed hard to bypass (except when you attack your friend's account and know all of their friends).

The second type is the text-based captcha, where you just enter the letters/numbers shown in the image. Something like this:

Let's look at some ways to bypass the text captcha :)

A couple of logic flaws...

My original aim was to focus on OCR with Tesseract but it turns out the captcha had logic flaws as well.

Issue #1 - When entering the captcha not all of the characters needed to be correct. If you got one character wrong it would still be accepted.

Issue #2 - The captcha check is case insensitive. Despite using uppercase and lowercase letters in the captcha images, the server didn't actually verify the case of user input.
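
To make the first two flaws concrete, here is a minimal sketch of a check that behaves the way Issues #1 and #2 describe. It is not Facebook's code: the function name `lenient_match` and the `max_wrong` parameter are made up for illustration, and it assumes "one character wrong" means one mismatched position in a string of the same length.

```python
def lenient_match(expected, submitted, max_wrong=1):
    """Hypothetical check: ignores case and tolerates up to max_wrong bad characters."""
    if len(expected) != len(submitted):
        return False
    # Issue #2: case is thrown away before comparing
    pairs = zip(expected.lower(), submitted.lower())
    # Issue #1: count mismatched positions instead of requiring zero
    wrong = sum(1 for a, b in pairs if a != b)
    return wrong <= max_wrong

print(lenient_match("aB3dE7g", "ab3de7g"))  # True, case differences only
print(lenient_match("aB3dE7g", "ab3de7x"))  # True, one character wrong
print(lenient_match("aB3dE7g", "ab3dexx"))  # False, two characters wrong
```

A check like this accepts far more answers than an exact comparison would, which is why imperfect OCR output can still pass.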

Issue #3 - Captcha repetition...

Each captcha should have contained a dynamically generated string randomly chosen from a pool of 62^7 possibilities. For some reason though I encountered repetition. This is obviously very bad as with a limited set of captchas an attacker can just download every image, solve them all and achieve a 100% bypass rate in the future. I have no idea what the cause of this issue was and Facebook didn't release any details.
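
For a sense of scale, the pool size quoted above can be worked out directly. This is only arithmetic, assuming the alphabet is a-z, A-Z and 0-9 (which is where the 62 comes from) and a length of 7 characters.

```python
import string

# Assumption: the alphabet is a-z, A-Z and 0-9, which gives the 62 in 62^7
alphabet = string.ascii_letters + string.digits
length = 7

pool = len(alphabet) ** length
print(len(alphabet))  # 62
print(pool)           # 3521614606208

# If the server ignores case (Issue #2), only 36 distinct symbols matter
effective = len(set(alphabet.lower())) ** length
print(effective)      # 78364164096
```

Either number is far too large to pre-solve, so seeing repeats suggests the real pool was much smaller than intended.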

The logic flaws were interesting but let's not forget OCR as well!

Back to the image...

Let's take a look at a Facebook captcha image:

When thinking about OCR analysis there are some things to note:

Letters/numbers themselves are clearly displayed in black - Good
Minimal overlaying, wiggling and distortion is used - Good
Black scribbles add noise to the background - Bad
White scribbles effectively remove pixels from the characters - Bad

I did some testing with Tesseract and found noise, image size, character size and spacing all had a big impact on the accuracy of results. For example, directly analysing the image above will return invalid characters or no response at all. To improve Tesseract results I needed some way to get rid of the noise and repair damaged characters.

Step #1 Cleaning

I chose to use Gimp for my image cleaning as it was a program I was familiar with and it offered command line processing with Python. While the documentation (here and here) and debugging aren't too good, it gets the job done.

So first up I loaded the image and doubled its size. I found that processing a smaller image was less accurate and reduced the quality of the final image.

#Load image
image = pdb.gimp_file_load(file, file)
drawable = pdb.gimp_image_get_active_layer(image)
#Double image size
pdb.gimp_image_scale(image,560,142)

Next I removed the background noise. Selecting by colour (black) and then shrinking the selection drops the thin black lines from the selection, leaving just the thicker black characters. To actually paint over the noise I just had to re-grow my selection, fill it black, invert and paint white.

#Select by color black
pdb.gimp_by_color_select(drawable,"#000000",20,2,0,0,0,0)
#Shrink selection by 1 pixel
pdb.gimp_selection_shrink(image,1)
#Grow selection by 2 pixels
pdb.gimp_selection_grow(image,2)
#Fill black
pdb.gimp_context_set_foreground((0,0,0))
pdb.gimp_edit_fill(drawable,0)
pdb.gimp_edit_fill(drawable,0)
pdb.gimp_edit_fill(drawable,0)
#Invert selection
pdb.gimp_selection_invert(image)
#Fill white
pdb.gimp_context_set_foreground((255,255,255))
pdb.gimp_edit_fill(drawable,0)
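
The shrink-then-grow trick is easier to see on a toy image. The sketch below is plain Python rather than Gimp, and it assumes a square neighbourhood, which is only a rough stand-in for what `gimp_selection_shrink` and `gimp_selection_grow` actually do. The helper names `shrink` and `grow` are made up, and it uses a radius of 1 for both steps where the script above shrinks by 1 and grows by 2.

```python
def shrink(mask, r=1):
    """Erode: keep a pixel only if every pixel within r of it is selected."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = int(all(
                0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                for dy in range(-r, r + 1) for dx in range(-r, r + 1)))
    return out

def grow(mask, r=1):
    """Dilate: select a pixel if any pixel within r of it is selected."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = int(any(
                0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                for dy in range(-r, r + 1) for dx in range(-r, r + 1)))
    return out

# 1 = black pixel. A 3x3 "character" block plus a 1-pixel-thick scribble on row 5.
mask = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]

cleaned = grow(shrink(mask))
for row in cleaned:
    print("".join("#" if p else "." for p in row))
```

The thin scribble disappears during the shrink and never comes back, while the thicker block is restored by the grow.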

With the outside black noise removed I inverted again to reselect the letters/numbers then translated up and down, painting after each translation. This helped fill in the white lines that in general streaked horizontally through the black characters.

#Invert selection
pdb.gimp_selection_invert(image)
pdb.gimp_context_set_foreground((0,0,0))
#Translate selection down 4 pixels and paint (positive y is down in Gimp)
pdb.gimp_selection_translate(image,0,4)
pdb.gimp_edit_fill(drawable,0)
#Translate selection up 10 pixels (net 6 above its starting position) and paint
pdb.gimp_selection_translate(image,0,-10)
pdb.gimp_edit_fill(drawable,0)
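
Why translating and painting repairs the characters can be shown on a single column of pixels. This is a plain Python sketch, not Gimp: the helper `paint_shifted` is hypothetical, and it applies one offset measured from the original position, whereas the two Gimp translations above are applied one after the other.

```python
def paint_shifted(pixels, selection, dy):
    """Paint black (1) wherever the selection lands after moving it dy rows."""
    out = list(pixels)
    for y, selected in enumerate(selection):
        if selected and 0 <= y + dy < len(out):
            out[y + dy] = 1
    return out

# One pixel column through a character: 1 = black, 0 = white.
# Rows 5-6 are a white scribble cutting through the stroke.
column = [0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0]
selection = list(column)  # the selection is the black pixels themselves

filled = paint_shifted(column, selection, 2)
print(filled)  # [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
```

The gap is filled, at the cost of the stroke growing by the same number of rows on one side, which seems an acceptable trade for OCR purposes.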

With the processing done I resized the image back to its original size and saved it.

#Resize image
pdb.gimp_image_scale(image,280,71)
#Export
pdb.gimp_file_save(image, drawable, file, file)
pdb.gimp_image_delete(image)

I've included the full script at the bottom of this post. I ran it with the following command:

gimp-console-2.8.exe -i -b "(python-clean RUN-NONINTERACTIVE \"test.png\")" -b "(gimp-quit 0)"

As an example, cleaning the image above I got this:

Step #2 Submitting to Tesseract

With the image now cleaned it was ready for Tesseract. To improve the accuracy of results I selected the single word mode (-psm 8) and used a custom character set (nobatch fb).

tesseract.exe test.png output -psm 8 nobatch fb

I created the fb character set in "C:\Program Files (x86)\Tesseract-OCR\tessdata\configs". It contained the following whitelist:

tessedit_char_whitelist abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890
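
If you'd rather generate the whitelist line than type it, something like the following would do. It is a small sketch: the helper `keep_whitelisted` is hypothetical and just illustrates tidying OCR output (for example the trailing newline Tesseract writes to its output file) against the same character set.

```python
import string

# The same 62 characters as the whitelist above, built rather than typed
whitelist = string.ascii_lowercase + string.ascii_uppercase + "1234567890"
config_line = "tessedit_char_whitelist " + whitelist
print(config_line)

def keep_whitelisted(text):
    """Drop anything outside the whitelist, such as whitespace or newlines."""
    return "".join(ch for ch in text if ch in whitelist)

print(keep_whitelisted("aB3 dE7g\n"))  # aB3dE7g
```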

Step #3 Automate everything with Python

I didn't bother to build a fully working POC to automate a real attack - I'm leaving this step as homework for you guys, best script wins $1 via Paypal ;) (I am of course joking don't actually do this!)

Theoretically though, if you did want to build a fully functioning script you'd just need to take the Python script from my Bugcrowd post and the cleaning script from this post, combine and pwn.

Also the following can be used to download Facebook captchas after you have triggered the Facebook defenses:

from urllib.error import URLError
from urllib.parse import urlencode
from urllib.request import Request, urlopen, urlretrieve
import re

def getpage():
    try:
        print("[+] POSTing to fb")
        params = {'lsd':'AVrQ4y7A', 'email':'09262073366', 'did_submit':'Search', '__user':'0', '__a':'1', '__dyn':'7wiUdp87ebG58mBWo', '__req':'p','__rev':'1114696','captcha_persist_data':'abc','recaptcha_challenge_field':'','captcha_response':'abc','confirmed':'1'}
        data = urlencode(params).encode('utf-8')
        request = Request("https://www.facebook.com/ajax/login/help/identify.php?ctx=recover")
        request.add_header('Cookie', 'locale=en_GB;datr=Ku2xUhSA3kShtkMud0JXRHCY; reg_fb_gate=https%3A%2F%2Fwww.facebook.com%2F%3Fstype%3Dlo%26jlou%3DAfco_1iUuf5XPNAuu9SBYhFnEoJfgxIw_9vwHlTfaTRjGB2Ac4VOSLHb018RjcLg3JVRsiY-sQlRSM00X59eKhLh5SJGHltQ0hEQ2WAiRR9A_g%26smuh%3D28853%26lh%3DAc-vs8zSU-_-6kh2%26aik%3Dqh9ABV52OPB3zXxCyUTNXw;')
        #Send request and analyse response
        f = urlopen(request, data)
        response = f.read().decode('utf-8')
        #Keep the parsed challenge code and hash in globals for getcaptcha()
        global ccode
        ccode = re.findall('[a-z0-9-]{43}', response)
        global chash
        chash = re.findall('[a-zA-Z0-9_-]{814}', response)
        print("[+] Parsed response")
    except URLError as e:
        print("*****Error: Cannot retrieve URL*****")

def getcaptcha(i):
    #Skip this round if the page did not contain the expected values
    if not ccode or len(chash) < 2:
        print("*****Error: Could not parse captcha details*****")
        return
    try:
        print("[+] Downloading Captcha")
        captchaurl = "https://www.facebook.com/captcha/tfbimage.php?captcha_challenge_code="+ccode[0]+"&captcha_challenge_hash="+chash[1]
        urlretrieve(captchaurl,'fbcap'+str(i)+'.png')
    except URLError as e:
        print("*****Error: Cannot retrieve URL*****")

#Start empty so getcaptcha() has something to check if the first request fails
ccode = []
chash = []

print("[+] Start!")
for i in range(0, 1000):
    #Download page and parse data
    getpage()
    #Download captcha image
    getcaptcha(i)
print("[+] Finished!")

Final Results

So I guess you're wondering, how accurate was Tesseract? Well on a sample of 50 captchas that had been cleaned with Gimp, Tesseract was able to analyse them 100% correctly about 20% of the time. However taking into account the logic flaws the actual pass rate jumped to 50%.
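
The two rates above are just two different ways of scoring the same OCR guesses. The sketch below shows the tallying, with made-up strings rather than real captcha data; the function names are hypothetical and the lenient rule assumes the flaws mean "ignore case, allow one wrong position".

```python
def exact(truth, guess):
    return truth == guess

def lenient(truth, guess):
    # Case-insensitive, at most one wrong position (Issues #1 and #2)
    if len(truth) != len(guess):
        return False
    return sum(a != b for a, b in zip(truth.lower(), guess.lower())) <= 1

def pass_rate(samples, check):
    """Fraction of (truth, guess) pairs accepted by check."""
    return sum(check(t, g) for t, g in samples) / len(samples)

# Made-up (truth, OCR guess) pairs, purely to exercise the functions
samples = [
    ("aB3dE7g", "aB3dE7g"),  # exact
    ("Qw9rTy2", "qw9rty2"),  # case only
    ("zX5cV1b", "zX5cV1h"),  # one character off
    ("mN8kL4p", "mH8kI4p"),  # two characters off
]
print(pass_rate(samples, exact))    # 0.25
print(pass_rate(samples, lenient))  # 0.75
```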

Some example results:

It's quite impressive seeing how well both the Gimp cleaning and Tesseract analysis performed, although you can also see how even subtle changes in the initial image can significantly affect both the cleaning output and the final analysis.

Facebook Fix #1

After reporting these issues the captcha repetition was addressed pretty quickly. The other logic flaws were left unchanged. The image itself was modified to make the characters/noise thicker:

Unfortunately this had little effect on the captcha strength, as it's the thickness of the noise relative to the characters that matters, not the absolute thickness. Making the noise thicker and the characters thinner would have prevented noise removal through selection shrinking.
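
The relative-thickness point can be sketched with a simplified model. It assumes a shrink by r eats r pixels off each side of a stroke, so a stroke survives only if it is more than 2*r pixels thick. Real selection shrinking is not quite this tidy, and both function names are made up.

```python
def survives_shrink(width, r):
    """A stroke `width` pixels thick survives a shrink by r only if width > 2*r."""
    return width > 2 * r

def usable_radius(noise, char):
    """Smallest shrink radius that removes the noise but keeps the characters, or None."""
    for r in range(1, char):
        if not survives_shrink(noise, r) and survives_shrink(char, r):
            return r
    return None

print(usable_radius(noise=2, char=6))   # 1
print(usable_radius(noise=4, char=12))  # 2, both doubled, still separable
print(usable_radius(noise=6, char=4))   # None, noise thicker than characters
```

Thickening everything just moves the working radius. Only when the noise is at least as thick as the characters does no radius work.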

Final Thoughts

Another day, another captcha bypass. Whether you use Tesseract or a bad-ass custom neural network like Google or Vicarious, text captchas can be bypassed with relative ease. I managed a 20% exact-match rate with Tesseract (50% once the logic flaws are counted), and I'm sure with a better cleaning process and/or Tesseract training this could be pushed a lot higher. It's time to ditch that text captcha.

Facebook said that right now the captcha is used more as a mechanism to slow down attacks as opposed to stopping attacks completely. The captcha will eventually be fixed but there are no plans at the moment.

Shout out to Facebook security for their help looking into this issue. Thanks for reading. Questions and comments are always appreciated, just leave a message below.

Pwndizzle out

############################################
#Gimp-Fu cleaning script, based on stackoverflow script here:
#http://stackoverflow.com/questions/12662676/writing-a-gimp-python-script?rq=1

from gimpfu import pdb, main, register, PF_STRING

def clean(file):
    #Load image
    image = pdb.gimp_file_load(file, file)
    drawable = pdb.gimp_image_get_active_layer(image)
    #Double image size
    pdb.gimp_image_scale(image,560,142)
    #Select by color black
    pdb.gimp_by_color_select(drawable,"#000000",20,2,0,0,0,0)
    #Shrink selection by 1 pixel
    pdb.gimp_selection_shrink(image,1)
    #Grow selection by 2 pixels
    pdb.gimp_selection_grow(image,2)
    #Fill black
    pdb.gimp_context_set_foreground((0,0,0))
    pdb.gimp_edit_fill(drawable,0)
    pdb.gimp_edit_fill(drawable,0)
    pdb.gimp_edit_fill(drawable,0)
    #Invert selection
    pdb.gimp_selection_invert(image)
    #Fill white
    pdb.gimp_context_set_foreground((255,255,255))
    pdb.gimp_edit_fill(drawable,0)
    #Invert selection
    pdb.gimp_selection_invert(image)
    pdb.gimp_context_set_foreground((0,0,0))
    #Translate selection down 4 pixels and paint (positive y is down in Gimp)
    pdb.gimp_selection_translate(image,0,4)
    pdb.gimp_edit_fill(drawable,0)
    #Translate selection up 10 pixels (net 6 above its starting position) and paint
    pdb.gimp_selection_translate(image,0,-10)
    pdb.gimp_edit_fill(drawable,0)
    #Resize image
    pdb.gimp_image_scale(image,280,71)
    #Export
    pdb.gimp_file_save(image, drawable, file, file)
    pdb.gimp_image_delete(image)

args = [(PF_STRING, 'file', 'GlobPattern', '*.*')]
register('python-clean', '', '', '', '', '', '', '', args, [], clean)

main()

############################################

35 comments:

Pwntoken, 12 July 2014 at 19:45
This is a great deduction.

Unknown, 21 October 2014 at 04:10
Hi,

Your Gimp script no longer removes all the disturbance from the image. I am creating a Facebook captcha reader. Would you help me with this?

PwnDizzle, 27 February 2015 at 19:33
Unfortunately I'm busy with other work right now. Don't give up though, filtering should be able to help with a lot of different captcha variations :)

D Ashwin, 7 January 2017 at 08:24
Great! Thanks for sharing.