Sanitizing & Validating User Input for Amazon Product Advertising API

Summary: Remove apostrophes from search input to prevent Amazon from returning “No Results”.

As a security-conscious web programmer, I always sanitize user input. It’s something you have to do to prevent attacks like SQL injection and cross-site scripting. Basically, it means that you never trust user input to be safe, and you always filter or sanitize it to remove anything potentially dangerous. PHP has lots of filters for sanitization, so that part is easy. For Amazon item search, my PHP code filters the user input like this:


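// FILTER_SANITIZE_STRING strips tags and encodes quotes (deprecated as of PHP 8.1).
// !empty() already covers the isset() check.
if (!empty($_POST['title'])) {
     $search_values['Title'] = filter_var($_POST['title'], FILTER_SANITIZE_STRING);
}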

But there’s a second step. In addition to sanitizing data, you also want to validate it to make sure that it’s in the expected format. To validate correctly, though, you have to understand the rules about what’s expected. Sometimes the rules are obvious, but other times you’re working with a bit of a black box.
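
As a rough sketch of that second step, here is one way a title could be validated before it goes anywhere near the API. The function name validate_title and the 200-byte limit are assumptions made up for illustration; they are not rules documented by Amazon.

```php
<?php
// Hypothetical helper: returns the trimmed title, or null if it fails validation.
// The 200-byte cap is an assumed limit, not a documented Amazon rule.
function validate_title($input, $max_length = 200)
{
    // Form fields can arrive as arrays (e.g. title[]=x), so insist on a string.
    if (!is_string($input)) {
        return null;
    }

    $title = trim($input);

    // Reject empty strings and anything longer than the assumed limit.
    // strlen() counts bytes, not characters.
    if ($title === '' || strlen($title) > $max_length) {
        return null;
    }

    // Reject ASCII control characters, which have no place in a book title.
    if (preg_match('/[\x00-\x1F\x7F]/', $title)) {
        return null;
    }

    return $title;
}
```

A caller would treat a null return as “don’t send this search” and show the user an error instead.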

Third-party APIs are like black boxes. You can’t read the code. You only have the documentation to rely upon. Giving the wrong input will get you errors or no results, and you might not know what you’re doing wrong. So, you need to experiment and see if you can figure out what’s allowed and what’s not.

This post is specific to the Amazon Product Advertising API, an Amazon service that lets you search by keyword and look up product information such as availability and pricing.

I use the Amazon Product Advertising API for several different projects, and for one of them, my application uses the ItemSearch operation. Specifically, I often search the Books category by Title.

Amazon’s API behaves differently than the search function on their site. On their site, if you put in an apostrophe, the site just ignores it. Send the same search through the API, and you get no results.

So, the solution is to always strip the user input of all apostrophes. Or, if you’re a fellow programmer, you might call them single quotes. If you leave the apostrophe in, you’ll get no results. If you remove it, you hopefully get lots of results!

In my program, I send the user input from an HTML form using Ajax. The apostrophe doesn’t reach my search code as a plain character, though. FILTER_SANITIZE_STRING encodes quotes, so the apostrophe arrives as its HTML entity, which is &#39;.

So, in my back end PHP code, I remove the apostrophe with this:


// str_replace() returns the new string, so assign the result.
$my_search_string = str_replace("&#39;", "", $my_search_string);

Problem solved! In practice, I combine that call with my sanitization code, and the whole thing looks like this:


if (!empty($_POST['title'])) {
     // Sanitize first (which encodes the apostrophe as &#39;), then strip the entity.
     $search_values['Title'] = str_replace("&#39;", "", filter_var($_POST['title'], FILTER_SANITIZE_STRING));
}
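
If you’d rather not depend on exactly how the apostrophe arrives, a small helper can strip both the raw character and its encoded forms. This is only a minimal sketch: the function name strip_apostrophes is made up for this example, and handling the curly apostrophe (’) is an assumption, since this post only covers the straight one.

```php
<?php
// Hypothetical helper: removes apostrophes in the forms they are likely to arrive in.
function strip_apostrophes($text)
{
    $forms = array(
        "'",        // raw straight apostrophe
        "&#39;",    // numeric entity produced by FILTER_SANITIZE_STRING
        "&#039;",   // numeric entity produced by htmlspecialchars() with ENT_QUOTES
        "&apos;",   // named entity
        "\u{2019}", // curly apostrophe (assumption: not covered in this post; needs PHP 7+)
    );

    // str_replace() accepts an array of search strings and removes each in turn.
    return str_replace($forms, "", $text);
}

echo strip_apostrophes("Charlotte&#39;s Web"); // prints: Charlottes Web
```

With a helper like this, the order of sanitizing and stripping stops mattering, because both the plain and the encoded apostrophe are removed.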