The Website Change Request Form has been a running topic around here for a little while, and I’m going to keep running with it. We are not going to rehash all the HTML and JavaScript that makes the form work, so if you need to catch up, go check out that first article.
What we have at this point is a pretty nice looking form that has a pretty nice user experience to it. I feel like it’s lacking two major things though. A) the notification emails themselves are pretty bland and basic text emails and B) there is almost no security at all on the form itself.
Thanks to Daniel Friedrich, I have now implemented some more serious security in the form, and that will be the focus of this article. The two big goals are:
- The form is being submitted by a human being
- That human being isn’t doing anything nefarious

Token Matching
The first thing that we are going to do is generate a “token”, essentially a secret code. This token is going to be part of our “session”, meaning it is stored server side. This token also is going to be applied as a hidden input on the form itself when it is first generated in the browser. That means this token exists both on the client side and the server side and we can match them when the form gets submitted and make sure they are the same. What this does is ensure that any submission of the form is our form and not some third-party script wailing away at us from a different server.
First we’ll need to start a session, then create a function to build the token, apply it to the session, and return it for our use.
session_start();
function generateFormToken($form) {
// generate a token from a unique value
$token = md5(uniqid(microtime(), true));
// Write the generated token to the session variable to check it against the hidden field when the form is sent
$_SESSION[$form.'_token'] = $token;
return $token;
}
The unique value here comes from an MD5 hash of a unique ID built from the microtime function, but Daniel says there are a slew of other methods (like salt-values…). Now that we have this function, right before we build the form in the markup we can call it to create the value.
<?php
// generate a new token for the $_SESSION superglobal and put them in a hidden field
$newToken = generateFormToken('form1');
?>
Then put the token as a hidden input into the form itself:
<input name="token" type="hidden" value="<?php echo $newToken; ?>">
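As a variation on the token generator above, here is a minimal sketch that draws the token from PHP's random_bytes() instead of md5() and microtime(). It assumes PHP 7 or newer. The function name generateSecureFormToken() is made up for this example, and the session array is passed in as a parameter so the snippet runs on its own.

```php
<?php
// Sketch only: generateSecureFormToken() is a hypothetical name, not part of the original form.
function generateSecureFormToken(array &$session, $form) {
    // random_bytes() (PHP 7+) reads from the operating system's secure random source
    $token = bin2hex(random_bytes(32)); // 64 hexadecimal characters
    // Store the token so it can be compared with the hidden field on submit
    $session[$form . '_token'] = $token;
    return $token;
}

// In a real page you would call session_start() first and pass $_SESSION here.
$session = array();
$newToken = generateSecureFormToken($session, 'form1');

// Escape the value when printing it into the hidden input
echo '<input name="token" type="hidden" value="'
    . htmlspecialchars($newToken, ENT_QUOTES, 'UTF-8') . '">' . "\n";
```

The rest of the flow stays the same: the token lives in the session and in the hidden input, and the two are compared on submit.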
Now we are prepared to check the token values against each other when the form is submitted. We’ll create a function to do that.
function verifyFormToken($form) {
// check if a session is started and a token is transmitted, if not return an error
if (!isset($_SESSION[$form.'_token'])) {
return false;
}
// check if the form is sent with token in it
if (!isset($_POST['token'])) {
return false;
}
// compare the tokens against each other if they are still the same
if ($_SESSION[$form.'_token'] !== $_POST['token']) {
return false;
}
return true;
}
This function checks that the token exists in both required places and that the two values match. If all three of those things are true, the function returns true. If not, it returns false. Now we check that value before proceeding. The basic structure is:
if (verifyFormToken('form1')) {
// ... more security testing
// if pass, send email
} else {
echo "Hack-Attempt detected. Got ya!.";
writeLog('Formtoken');
}
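The token check can also be written with hash_equals(), which compares two strings in constant time. This is only a sketch, assuming PHP 5.6 or newer. verifyFormTokenSafe() is a hypothetical name, and the session and POST arrays are passed in as parameters so the function can be tried outside of a real request.

```php
<?php
// Sketch only: verifyFormTokenSafe() is a hypothetical variant of verifyFormToken().
function verifyFormTokenSafe(array $session, array $post, $form) {
    $key = $form . '_token';
    // Both copies of the token must exist
    if (!isset($session[$key], $post['token'])) {
        return false;
    }
    // A posted array (token[]=x) would break the comparison, so require strings
    if (!is_string($session[$key]) || !is_string($post['token'])) {
        return false;
    }
    // hash_equals() (PHP 5.6+) avoids leaking how many characters matched
    return hash_equals($session[$key], $post['token']);
}

// Usage with made-up values; in the form handler you would pass $_SESSION and $_POST.
$session = array('form1_token' => 'abc123');
var_dump(verifyFormTokenSafe($session, array('token' => 'abc123'), 'form1'));
```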
Hack Logging
Notice in the structure above we use a function called writeLog() that takes a string. There are a variety of circumstances in which we have detected foul play and need to stop. In those cases we will call the writeLog() function to log the error for our own reference and then die().
This function attempts to write to a text file on the server (that file is going to need proper file permissions, user writeable) or if that fails, it will email it to you.
function writeLog($where) {
$ip = $_SERVER["REMOTE_ADDR"]; // Get the IP from superglobal
$host = gethostbyaddr($ip); // Try to locate the host of the attack
$date = date("d M Y");
// create a logging message with php heredoc syntax
$logging = <<<LOG
\n
<< Start of Message >>
There was a hacking attempt on your form. \n
Date of Attack: {$date}
IP-Address: {$ip} \n
Host of Attacker: {$host}
Point of Attack: {$where}
<< End of Message >>
LOG;
// open log file
if($handle = fopen('hacklog.log', 'a')) {
fputs($handle, $logging); // write the Data to file
fclose($handle); // close the file
} else { // if first method is not working, for example because of wrong file permissions, email the data
$to = 'admin@example.com'; // placeholder: use your own address
$subject = 'HACK ATTEMPT';
$header = 'From: form@example.com'; // placeholder: use an address on your own domain
if (mail($to, $subject, $logging, $header)) {
echo "Sent notice to admin.";
}
}
}
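If two bad requests arrive at the same moment, both will try to append to the log at once. Here is a minimal sketch of the file-writing step using file_put_contents() with a lock. appendLogEntry() and the temp-file path are assumptions made for this example, not part of the original writeLog().

```php
<?php
// Sketch only: appendLogEntry() and its parameters are hypothetical.
function appendLogEntry($file, $message) {
    // FILE_APPEND adds to the end of the file; LOCK_EX holds an exclusive lock while writing
    $written = @file_put_contents($file, $message . PHP_EOL, FILE_APPEND | LOCK_EX);
    // false means the write failed (for example, wrong file permissions)
    return $written !== false;
}

// Usage with a throwaway file in the system temp directory
$file = sys_get_temp_dir() . '/hacklog-example.log';
if (!appendLogEntry($file, 'Point of Attack: Formtoken')) {
    // this is where the email fallback from writeLog() would go
    echo "Could not write to log.\n";
}
```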
Nothing POSTED we didn’t ask for
If any values get posted to us that don’t have names from inputs in our own form, something funky is definitely going on. We’ll build a “whitelist” of acceptable post names and then check each one.
// Build a whitelist array of the field names the form sends; no others are accepted later on
$whitelist = array('token','req-name','req-email','typeOfChange','urgency','URL-main','addURLS', 'curText', 'newText', 'save-stuff');
// Building an array with the $_POST-superglobal
foreach ($_POST as $key=>$item) {
// Check if the value $key (fieldname from $_POST) can be found in the whitelisting array, if not, die with a short message to the hacker
if (!in_array($key, $whitelist)) {
writeLog('Unknown form fields');
die("Hack-Attempt detected. Please use only the fields in the form");
}
}
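The same whitelist check can be done in one pass with array_diff(), which is handy if you want to log exactly which field names were unexpected. This is a sketch with a hypothetical helper name and a shortened whitelist. The sample POST data is made up.

```php
<?php
// Sketch only: findUnexpectedFields() is a hypothetical helper.
function findUnexpectedFields(array $post, array $whitelist) {
    // array_diff() returns the submitted names that are not in the whitelist
    return array_values(array_diff(array_keys($post), $whitelist));
}

// Shortened whitelist and made-up POST data for illustration
$whitelist = array('token', 'req-name', 'req-email');
$post = array('token' => 'abc', 'req-name' => 'Chris', 'admin' => '1');

$unexpected = findUnexpectedFields($post, $whitelist);
if (count($unexpected) > 0) {
    // in the real handler: writeLog('Unknown form fields'); die(...);
    echo 'Unexpected fields: ' . implode(', ', $unexpected) . "\n";
}
```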
Valid URL
The client-side validation on this form watches for this, so it should be caught up front, but of course we should be checking on the back end too.
// Let's check whether the URL is a real URL or not. If not, stop the script
if (!filter_var($_POST['URL-main'],FILTER_VALIDATE_URL)) {
writeLog('URL Validation');
die('Please insert a valid URL');
}
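One thing to know about FILTER_VALIDATE_URL is that it accepts schemes other than http, such as ftp://. If you only want web addresses, a sketch like the following adds a scheme check with parse_url(). isHttpUrl() is a hypothetical helper name.

```php
<?php
// Sketch only: isHttpUrl() is a hypothetical helper.
function isHttpUrl($url) {
    // Reject non-strings (e.g. URL-main[]=x) and anything filter_var() considers invalid
    if (!is_string($url) || filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }
    // Only allow the two web schemes
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    return in_array($scheme, array('http', 'https'), true);
}

var_dump(isHttpUrl('http://www.example.com/page')); // bool(true)
var_dump(isHttpUrl('ftp://example.com/file'));      // bool(false)
```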
Cleaning values
At this point we have made all our security checks and are proceeding to create and send the email. Taking all the inputs exactly as entered and sending them on is a potential security risk. For example, JavaScript could be entered in a text field and then sent over email and potentially run when the email is opened.
For the fields like “Name”, where there is no reason to use any special tags, we will strip them entirely with strip_tags(). For the textareas, where there may be occasion to use some tags, we’ll just use the htmlentities() function to convert them safely.
Example:
$message .= "Name: " . strip_tags($_POST['req-name']) . "\n";
$message .= "NEW Content: " . htmlentities($_POST['newText']) . "\n";
More Better Cleaning
Krinkle writes in to say:
Originally you used strip_tags() for normal fields and htmlentities() for content. This is fine, except that it’s best practice to declare ENT_NOQUOTES and "UTF-8" as well, since otherwise characters like the accent on the e ("é") could become crap like Ã…@. And since most servers add slashes to input brought via $_POST[], it’s not a bad thing to run stripslashes just in case, else you’d have a slash in front of every single or double quote that was entered in the form once it’s in the mailbox.
function stripcleantohtml($s){
// Removes slashes added to the input (e.g. " I\'m John ") and converts special characters to HTML entities (e.g. &amp;, &lt;, &gt;)
// Also strips any <html> tags it may encounter
// Use: Anything that shouldn't contain html (pretty much everything that is not a textarea)
return htmlentities(trim(strip_tags(stripslashes($s))), ENT_NOQUOTES, "UTF-8");
}
function cleantohtml($s){
// Removes slashes added to the input (e.g. " I\'m John ") and converts special characters to HTML entities
// It preserves any <html> tags in that they are encoded as well (like &lt;html&gt;)
// As an extra security, if people would try to inject tags that would become tags after stripping away bad characters,
// we do still strip tags but only after htmlentities, so any genuine code examples will stay
// Use: For input fields that may contain html, like a textarea
return strip_tags(htmlentities(trim(stripslashes($s)), ENT_NOQUOTES, "UTF-8"));
}
Usage in this example:
$message .= "Name: " . stripcleantohtml($_POST['req-name']) . "\n";
$message .= "NEW Content: " . cleantohtml($_POST['newText']) . "\n";
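To see the difference between stripping and encoding, here is a tiny sketch that runs both on made-up sample values. It calls the built-in functions directly so it works on its own.

```php
<?php
// Sketch only: the sample values below are made up for illustration.
$name = '<b>Chris</b>';
$content = "<em>caf\xC3\xA9</em>"; // "café" written out as UTF-8 bytes

// strip_tags() removes the markup entirely
$cleanName = strip_tags($name);

// htmlentities() keeps the markup visible by encoding it as entities
$cleanContent = htmlentities($content, ENT_NOQUOTES, 'UTF-8');

echo $cleanName . "\n";    // Chris
echo $cleanContent . "\n"; // &lt;em&gt;caf&eacute;&lt;/em&gt;
```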
Why not use a CAPTCHA?
CAPTCHAs do a fairly good job of keeping spam off of forms, but we were worried more about hacking here than spam. Also, since this form is for people that we probably LIKE and are trying to HELP, we aren’t going to put them through the annoying hoop-jumping of a CAPTCHA. However, if you are interested, a super-duper simple home-brew captcha is to ask something like “What is ten minus five?” in a text input and then check for the value “5” or any capitalization of “five”. If it’s a match, continue. If not, don’t. This version of the form already has a writeLog() function which is ready to use for just this, and the logic would fit in nicely around lines 75 and 76.
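The home-brew question described above could be checked with something like this sketch. isCorrectAnswer() is a hypothetical name, and the field name in the usage comment is made up.

```php
<?php
// Sketch only: isCorrectAnswer() is a hypothetical helper for "What is ten minus five?"
function isCorrectAnswer($answer) {
    if (!is_string($answer)) {
        return false;
    }
    // Ignore surrounding spaces and capitalization: "Five", "FIVE", " five " all pass
    $normalized = strtolower(trim($answer));
    return $normalized === '5' || $normalized === 'five';
}

// In the form handler this might look like (field name is an assumption):
// if (!isCorrectAnswer($_POST['human-check'])) { writeLog('Captcha'); die('Wrong answer'); }
var_dump(isCorrectAnswer(' FiVe ')); // bool(true)
```

Remember that a new field like this would also need to be added to the whitelist.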
If you’d like a little more robust CAPTCHA, check out reCAPTCHA, which is pretty easy to use, helps people, and is very accessible.
Security Gurus?
When we talk about security around here, there tend to be a lot of opinions on how things are being done. If you have ideas, please share them below as constructively as you can, and I’ll be digesting and seeing how we can improve as we go along.
Nice post, and quite important to every webdesigner that creates some sort of user-input.
One thing: the Md5 encryption.. This is generally called ‘not reaaal safe, better use SHA-256bit encryption’. But how to do this? There should be some sha256(string) function in php, but I can’t get this to work.
First google result: http://www.nanolink.ca/pub/sha256/
Yea I noticed that one. But I want to use it in my own php files, and it’s not working :-/ like echo sha256('hello'); doesn’t output anything here. I have googled a lot, don’t worry. I was actually looking for replies like ‘oh he just forgot to..’ because I think it’s some simple thing I didn’t think of. But thanks anyway! :-)
I believe this is what you’re looking for.
The (anti-csrf) token is stored in a session. And a session is only stored for a half hour. You cannot get a md5 collision in that half hour.
And md5 is not an encryption it’s hash, therefore it cannot be decrypted. :)
The token is used to protect the user from csrf attacks. It is just a random string that is sent with the form data as some sort of ID that others don’t have. Without the correct token/ID the form is not processed.
But it can be looked up. Think dictionary attack.
But just sending POST request to the form will not work since you need a token first. To get that token you have to visit the site (GET request).
So sending POSTs in the form of a dictionary attack won’t work.
Oh yes. There’s this website that has a database containing the values and their hashes combined with a total number of like 1.2 billion entries. That is sick.
Excellent post – every webmaster should work through these steps before looking at Captcha solutions.
Aside from providing a better user experience, there is often a tangible business benefit – we recently stripped the ReCAPTCHA from our registration form and used some of these techniques. This gave us a double digit lift in the number of conversions we were getting.
That was a great idea. I like it. Hashes are great.
I have a question that is a little off topic. Why is it that a lot of site designers’ sites that I have been to lately do not use target="_blank" in their anchor tags? It keeps external sites separate from your own and causes less confusion to some general “surfers”.
Well done, I really enjoyed reading this. Most websites aren’t secured in any way, but this kicks everything in the right place. Also don’t forget stuff like HTTPS and Salting stuff.
Keep up the good work!
You use the hash() function.
See Here
Example:
hash("sha256", "My string to be hashed");
We pulled CAPTCHA validation and went with Wufoo to supply forms and saw a huge jump in forms submitted (like double). This is great as my boss is questioning the monthly cost and wanted to know about alternatives.
Nice post! I think the token bit is great. Not only is it useful to make sure a form is submitted from the page that generated it, it’s also good to be used with AJAX requests to make sure that the request was done by the intended page.
Great write-up Chris. Always looking for workarounds for CAPTCHA. They are an assault on the conversion process and an insult to usability.
I was able to insert SQL into one of the text boxes.
the form sends an email and you injected SQL? Good job, mate!
Will you post what you inputted and into what field? I would like to test on my own forms as well for solid SQL injection attempts.
There is no such thing as SQL-injection in a PHP-file that doesn’t touch a database.
What you’ve done here is a pretty good countermeasure, but note that there are sneaky ways of injecting javascript without tags (or what we normally think of as tags).
At some point, it is always possible to hack. The idea is to make it so hard that no one will bother.
Clear the token after a successful submit to counter-act people “bouncing” on F5 and re-submitting the same form over and over again.
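A minimal sketch of that idea, assuming the token lives under the same session key the article uses. consumeFormToken() is a hypothetical name, and the session array is passed by reference so the snippet can be run without a real session.

```php
<?php
// Sketch only: consumeFormToken() is a hypothetical helper.
function consumeFormToken(array &$session, array $post, $form) {
    $key = $form . '_token';
    $valid = isset($session[$key], $post['token'])
        && $session[$key] === $post['token'];
    if ($valid) {
        // Remove the token after a successful check so a re-submit (F5) fails
        unset($session[$key]);
    }
    return $valid;
}

// Usage with made-up values; in the form handler you would pass $_SESSION and $_POST.
$session = array('form1_token' => 'abc123');
$post = array('token' => 'abc123');
var_dump(consumeFormToken($session, $post, 'form1')); // bool(true)
var_dump(consumeFormToken($session, $post, 'form1')); // bool(false), token already used
```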
I’m using reCAPTCHA on a site and having pretty good success. Maybe I might try this if I can find the time…
“What this does is ensure that any submission of the form is our form and not some third-party script wailing away at us from a different server.”
Why not use PHP’s $_SERVER['HTTP_REFERER'] to see if the request came from your form page?
Because it can be spoofed by the client.
Because the referer field can easily be tampered with. The client can send any value the user wants it to. My opera for example doesn’t send any referer information at all.
Making an extra validation on HTTP_REFERER wouldn’t hurt, so go for it if anyone is thinking about it. It’s not like all spammers are smart anyway coughs
interesting post you got here. I’ll check it out soon.
indeed, an interesting post.
Great topic Chris! With wufoo outages like right now, it is time to start coding forms myself.
This will come in handy!
NEVER, ever, ever, ever, ever, ever, ever, ever, ever use MD5 as your algorithm of choice! Never! …except without a condom.
SHA1 is much safer than MD5, but both can be bruteforced, MD5 much easier than SHA1.
Though, through all hashes, you should use a SALT (the condom I was talking about). This can make both an MD5 and SHA1 more secure. Instead of taking 200,000 tries with MD5, or nearly a million tries with SHA1, it would take roughly 800 trillion tries to correctly guess an SHA1 + prepended SALT.
Therefore, if you’re going to do any fooling around, wear a Salt condom.
Would this be an even better solution?
Any thoughts on this?
$salt = time();
$string = uniqid(microtime(), true);
switch( substr( $salt, -1, 1) ) {
case 1:
$token = sha1( $salt . $string ) . substr( $salt , 1, 3);
break;
case 2:
$token = sha1( $string . $salt ) . substr( $salt , 3, 2);
break;
case 3:
$token = substr( $salt , 5, 2) . sha1( $salt . $string ) ;
break;
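default:
// any other last digit: without this, $token would be undefined
$token = sha1( $salt . $string );
break;
}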
Better or no ?
Much better, but I also found this on the sha1 PHP Manual (no credit to me, I didn’t come up with this), that also seems to do a good job at creating a more difficult pseudo salt..
<?php
function createHash($inText, $saltHash=NULL, $mode='sha1'){
// hash the text //
$textHash = hash($mode, $inText);
// set where salt will appear in hash //
$saltStart = strlen($inText);
// if no salt given create random one //
if($saltHash == NULL) {
$saltHash = hash($mode, uniqid(rand(), true));
}
// add salt into text hash at pass length position and hash it //
if($saltStart > 0 && $saltStart < strlen($saltHash)) {
$textHashStart = substr($textHash,0,$saltStart);
$textHashEnd = substr($textHash,$saltStart,strlen($saltHash));
$outHash = hash($mode, $textHashEnd.$saltHash.$textHashStart);
} elseif($saltStart > (strlen($saltHash)-1)) {
$outHash = hash($mode, $textHash.$saltHash);
} else {
$outHash = hash($mode, $saltHash.$textHash);
}
// put salt at front of hash //
$output = $saltHash.$outHash;
return $output;
}
?>
Since rand() uses time to create its pseudo random sum, I think this would also be a fine way to create a salt.
To fight hacking even more, try to use a form timeout.
I wrote about this in an article here: serversidemagazine.com/php/php-security-measures-against-csrf-attacks
Oh, and also in the comments of that article a few readers toggled with the idea of AJAX CSRF attacks, which is a pretty new hacking method.
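A form timeout can be sketched by storing the time the token was issued next to the token itself, then rejecting submissions that arrive too late. Everything here is an assumption for illustration: the function names, the '_token_time' session key, and the 15 minute limit. It also assumes PHP 7 or newer for random_bytes().

```php
<?php
// Sketch only: function names, the session key suffix and the time limit are hypothetical.
function issueTimedToken(array &$session, $form, $now) {
    $token = bin2hex(random_bytes(16));
    $session[$form . '_token'] = $token;
    // Remember when the form was generated
    $session[$form . '_token_time'] = $now;
    return $token;
}

function isTokenFresh(array $session, $form, $now, $maxAge = 900) {
    $key = $form . '_token_time';
    if (!isset($session[$key])) {
        return false;
    }
    // Accept the submission only if it arrives within $maxAge seconds
    return ($now - $session[$key]) <= $maxAge;
}

// Usage: pass time() in real code; the session array stands in for $_SESSION.
$session = array();
issueTimedToken($session, 'form1', time());
var_dump(isTokenFresh($session, 'form1', time())); // bool(true)
```

This would run alongside the token match, not instead of it.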
This was an interesting read and a great choice of topic. The token and the white list are interesting ideas. I’ll definitely consider using these techniques next time I build a form.
Thanks Chris!
Genius!
Hey, I thought I would point out that in the Valid URL checker, if I typed in:
www.somesite.com
it returned an invalid URL, but if I added http:// it returned true.
I can see possible flaws in that routine, namely that clients are probably not going to type in http:// because they think of a website link in its simplest format, “www.sitename.com”.
In applications I write, I check to see if the string contains http://. If it does, I leave the string as it is. If no http:// is found, I add it using the string function str_replace.
I could be wrong in doing that, but I’m a young web developer (20) and I’ve only been doing this for a year, but it seems logical.
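Here is a sketch of that "add http:// if it is missing" idea. Note that it uses preg_match() and string concatenation instead of the str_replace approach the comment mentions, and addDefaultScheme() is a hypothetical name.

```php
<?php
// Sketch only: addDefaultScheme() is a hypothetical helper.
function addDefaultScheme($url) {
    $url = trim($url);
    // Leave the value alone if it already starts with http:// or https://
    if (preg_match('#^https?://#i', $url)) {
        return $url;
    }
    // Otherwise assume the visitor meant a plain web address
    return 'http://' . $url;
}

$url = addDefaultScheme('www.sitename.com');
echo $url . "\n"; // http://www.sitename.com
// The normalized value can then go through the same check as before
var_dump(filter_var($url, FILTER_VALIDATE_URL) !== false); // bool(true)
```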
looks great in firefox. in good ole IE 7 & 8 the drop down gets obscured…
wonder if a z-index snafu might be the problem…?
Most of these comments are lunacy. This method is fine for CSRF, but there’s nothing stopping anyone from running a script remotely, pulling the keys->values from the form and submitting it remotely.
This same exact method has been used in open source systems for years. It’s nothing new.
I agree with Danny – the use of MD5 is not an issue in this CSRF example.
For example, Django’s CSRF middleware uses MD5:
http://code.djangoproject.com/browser/django/trunk/django/contrib/csrf/middleware.py#L27
There is no security flaw in this implementation.
In general these comments confirm the obvious – don’t try to implement CSRF protection yourself – you will get it wrong. Instead use a well-known, supported library. Django uses the CSRF middleware – other languages and frameworks undoubtedly have their own implementations.
Very informative article. As a side note, on your form
Additional URL's / Areas:
Should be
Additional URLs / Areas:
No apostrophe. :)
Hi Chris,
people talk obviously a lot of development here – I’m thinking about the usability of the form. There are certain things I miss or I see as handicap:
1) Dropping an E-mail is faster than filling out/sending a form for the client.
2) Some clients do prepare changes in documents (PDF, Word). Why should they use a form?
3) Form can’t handle complex changes (multiple changes on multiple pages, multiple related changes, design changes, …)
4) Form can not upload and send media files (documents, images)
just to name a few. I think that most clients will not use a form for changes.
cheers
Hi Chris,
Nice tut on form security. Does this serve as an alternative for captcha? Can it block the bots as well?
Ciao
Hi,
One more question: does this method protect forms from xrumer?
Thnx
why doesn’t this form send? I changed the appropriate emails?
CSRF protection isn’t foolproof though – it only takes a GET request to the form page, parsing out the token and transferring it and the session information to the POST request to circumvent. Given there are spam bots that work by filling in the form and submitting it (rather than just submitting it), this approach needs to be coupled with another one.
I downloaded the demo file. Thanks for sharing that. But something seems incomplete at the end of the PHP stuff, right after the mail() function, just before the html begins. There’s an “if” clause with no body. Don’t know how this blog handles code but I’ll try:
} else {
if (!isset($_SESSION[$form.'_token'])) {
//shouldn't something go here?
} else {
echo "Hack-Attempt detected. Got ya!.";
writeLog('Formtoken');
}
wow
that’s weird. I wrote a post and it wound up 4 posts above the last one. never seen that happen. Anyway… 4 posts up there’s a question about an empty “if” conditional in the demo file. Can anyone help me figure out why it’s there and what it might be missing? Thanks!
Ahh never mind. Weirdness. My posts written July 2 are appearing before ones posted May 19 and 20.
Just a thought, but you are probably better off setting the name of the input as the token. That way, if someone tries to get the token with a GET request, the name of the input is different. Might be a little better anyway.
Also, md5, sha1, whatever. I don’t believe that it should matter at all. There is no encrypting of information here, so no worries about that. You are just verifying that it is the same. All that should really matter is that the token is fairly unique and can be verified to be the same per request.
A hacker can easily get the token value from the input using a regular expression.
ex: open our login page using ajax or curl and use a regular expression to get the token value from the hidden input.
How to prevent this?
hi, for all those who wonder how to stop a hacker get the token, you can use
thank you, that saves the cookie as Whirlpool and does it securely, and http only (No Ajax request)
i am not an expert, but if it helps i don’t know.