Response encoding with node.js "request" module

Problem

I am trying to get data from the Bing search API, and since the existing libraries seem to be based on old, discontinued APIs, I thought I'd try it myself using the request library, which appears to be the most common library for this. My code looks like

var SKEY           =  "myKey...." , 
    ServiceRootURL =  'https://api.datamarket.azure.com/Bing/Search/v1/Composite';

function getBingData(query, top, skip, cb) {
    var params = {
         Sources: "'web'", 
         Query: "'"+query+"'", 
         '$format': "JSON", 
         '$top': top, '$skip': skip
       },
       req = request.get(ServiceRootURL).auth(SKEY, SKEY, false).qs(params);
    request(req, cb)
}

getBingData("bookline.hu", 50, 0, someCallbackWhichParsesTheBody)

Bing returns some JSON and I can work with it sometimes, but if the response body contains a large number of non-ASCII characters, JSON.parse complains that the string is malformed. I tried switching to an ATOM content type, but there was no difference: the XML was invalid. Inspecting the response body as available in the request() callback actually shows garbled data.

So I tried the same request with some python code, and that appears to work fine all the time. For reference:

r = requests.get(
       'https://api.datamarket.azure.com/Bing/Search/v1/Composite?Sources=%27web%27&Query=%27sexy%20cosplay%20girls%27&$format=json', 
        auth=HTTPBasicAuth(SKEY,SKEY))
stuffWithResponse(r.json())

I am unable to reproduce the problem with smaller responses (e.g. limiting the number of results) and unable to identify a single result which causes the issue (by stepping up the offset). My impression is that the response gets read in chunks, transcoded somehow and reassembled back in a bad way, which means the json/atom data becomes invalid if some multibyte character gets split, which happens on larger responses but not small ones.
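The suspicion above can be illustrated without any network access. This is a minimal sketch, not the request library's actual internals: the sample text and the split position are made up for illustration. It splits a UTF-8 byte sequence in the middle of a character, decodes each chunk on its own, and compares that with decoding after Buffer.concat and with Node's string_decoder module.

```javascript
var StringDecoder = require('string_decoder').StringDecoder;

// Sample text with multibyte UTF-8 characters (illustrative only).
var original = 'könyv árvíztűrő';
var bytes = Buffer.from(original, 'utf8');

// Pretend the response arrived in two chunks, split inside the 2-byte 'ö'.
var chunks = [bytes.slice(0, 2), bytes.slice(2)];

// Wrong: decoding each chunk separately replaces the split character with U+FFFD.
var naive = chunks.map(function (c) { return c.toString('utf8'); }).join('');

// Right: join the raw bytes first, then decode once.
var joined = Buffer.concat(chunks).toString('utf8');

// Also right: StringDecoder holds back incomplete sequences between writes.
var decoder = new StringDecoder('utf8');
var streamed = chunks.map(function (c) { return decoder.write(c); }).join('') + decoder.end();

console.log(naive === original);    // false
console.log(joined === original);   // true
console.log(streamed === original); // true
```

With small responses everything tends to arrive in a single chunk, which would explain why the problem only shows up on larger ones.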

Being new to node, I am not sure if there is something I should be doing (setting the encoding somewhere? Bing returns UTF-8, so this doesn't seem needed).

Does anyone have any idea what is going on?

FWIW, I'm on OSX 10.8, node is v0.8.20 installed via macports, request is v2.14.0 installed via npm.

Problem courtesy of: riffraff

Solution

I'm not sure about the request library, but the default Node.js one works well for me. It also seems a lot easier to read than your library, and the response does indeed come back in chunks.

http://nodejs.org/api/http.html#http_http_request_options_callback or for https (like your req) http://nodejs.org/api/https.html#https_https_request_options_callback (the same really though)

A little tip for the options: use url.parse

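var url = require('url');

// Placeholder request body; a GET normally sends none
var params = '{}';

var dataURL = url.parse(ServiceRootURL);
var post_options = {
    hostname: dataURL.hostname,
    // ServiceRootURL is https, so fall back to 443 rather than 80
    port: dataURL.port || 443,
    path: dataURL.path,
    method: 'GET',
    headers: {
        'Content-Type': 'application/json; charset=utf-8',
        // Content-Length counts bytes, not characters
        'Content-Length': Buffer.byteLength(params)
    }
};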

Obviously, params needs to be the data you want to send.
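Since the response comes back in chunks, the body has to be reassembled before it is parsed. Here is a minimal sketch of one way to do that with the core https module, assuming an options object like post_options above with the authentication header added. collectBody and fetchJson are hypothetical helper names, not part of Node or of the request library.

```javascript
var https = require('https');

// Hypothetical helper: keep every 'data' chunk as a Buffer and decode once
// at 'end', so a multibyte character split across chunks stays intact.
function collectBody(res, cb) {
    var chunks = [];
    res.on('data', function (chunk) {
        chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
    });
    res.on('end', function () {
        cb(null, Buffer.concat(chunks).toString('utf8'));
    });
    res.on('error', cb);
}

// Hypothetical wrapper: send the request described by `options`, parse JSON.
function fetchJson(options, cb) {
    var req = https.request(options, function (res) {
        collectBody(res, function (err, body) {
            if (err) { return cb(err); }
            var data;
            try {
                data = JSON.parse(body);
            } catch (e) {
                return cb(e);
            }
            cb(null, data);
        });
    });
    req.on('error', cb);
    req.end();
}
```

The key design choice is that nothing is converted to a string until all the bytes have arrived, so the chunk boundaries no longer matter.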

Solution courtesy of: rob_james

Discussion

I think your request authentication is incorrect. Authentication has to be provided before request.get. See the documentation for request HTTP authentication. qs is an object that has to be passed in the request options, just like url and auth. Also, you are reusing the same req for a second request. request.get returns a stream for the GET of the given URL, so your next request using req will go wrong.

If you only need HTTPBasicAuth, this should also work

//remove req = request.get and subsequent request
request.get('http://some.server.com/', {
  'auth': {
    'user': 'username',
    'pass': 'password',
    'sendImmediately': false
  }
}, function (error, response, body) {
});

The callback argument gets 3 arguments. The first is an error when applicable (usually from the http.Client option not the http.ClientRequest object). The second is an http.ClientResponse object. The third is the response body String or Buffer. The second object is the response stream. To use it you must use events 'data', 'end', 'error' and 'close'.

Be sure to use the arguments correctly.
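Putting those points together, the whole call can be described by a single options object passed to request once. This is a sketch under assumptions: buildBingOptions is a hypothetical helper, the URL and query parameter names are copied from the question, and url, qs, auth and json are options of the request library.

```javascript
// Hypothetical helper: build one options object for the request library.
function buildBingOptions(key, query, top, skip) {
    return {
        url: 'https://api.datamarket.azure.com/Bing/Search/v1/Composite',
        // request serializes `qs` into the query string
        qs: {
            Sources: "'web'",
            Query: "'" + query + "'",
            '$format': 'JSON',
            '$top': top,
            '$skip': skip
        },
        // Basic auth with the key as both user and password, as in the question
        auth: { user: key, pass: key, sendImmediately: false },
        // Ask request to parse the body as JSON before calling back
        json: true
    };
}

// Usage, assuming the request module has been installed and required:
// request(buildBingOptions(SKEY, 'bookline.hu', 50, 0), function (error, response, body) {
//     if (error) { return console.error(error); }
//     console.log(response.statusCode, body);
// });
```

Passing everything in one call avoids building a request with request.get and then handing that object to request a second time.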

Discussion courtesy of: user568109

You have to pass the option {json: true} to enable JSON parsing of the response.

Discussion courtesy of: Duane Fields

This recipe can be found in its original form on Stack Overflow.