Thursday, January 17, 2013

Reverse Engineer USB protocol of Canon 4200F (part 1)

So far, there is no Linux driver for the Canon CanoScan 4200F scanner, even with the latest and greatest kernel.  Blame Canon, which does not want to support the Linux community.

In the spirit of hacking, I am now trying to reverse-engineer the driver (which is only available for Windows, and probably Mac OS X) so that I can write a bare-minimum Linux driver. I will post my progress here, which is why the title says "part x".

First, mount debugfs and load the usbmon module:


mount -t debugfs none_debugs /sys/kernel/debug
modprobe usbmon


Then list the USB devices:

lsusb

On my system, it shows (partial listing):
...

Bus 001 Device 004: ID 056a:0017 Wacom Co., Ltd Bamboo Fun 4x5
Bus 001 Device 005: ID 046d:08d7 Logitech, Inc. QuickCam Communicate STX
Bus 001 Device 006: ID 04a9:221b Canon, Inc. CanoScan 4200F
Bus 001 Device 007: ID 046d:c52b Logitech, Inc. Unifying Receiver
...

Or, with the class hierarchy:


root@HP-m9000t:~# lsusb -t
1-3.4:1.0: No such file or directory
1-3.7.3:1.0: No such file or directory
/:  Bus 08.Port 1: Dev 1, Class=root_hub, Driver=uhci_hcd/2p, 12M
/:  Bus 07.Port 1: Dev 1, Class=root_hub, Driver=uhci_hcd/2p, 12M
/:  Bus 06.Port 1: Dev 1, Class=root_hub, Driver=uhci_hcd/2p, 12M
/:  Bus 05.Port 1: Dev 1, Class=root_hub, Driver=uhci_hcd/2p, 12M
/:  Bus 04.Port 1: Dev 1, Class=root_hub, Driver=uhci_hcd/2p, 12M
/:  Bus 03.Port 1: Dev 1, Class=root_hub, Driver=uhci_hcd/2p, 12M
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=ehci_hcd/8p, 480M
    |__ Port 4: Dev 2, If 0, Class=hub, Driver=hub/4p, 480M
        |__ Port 1: Dev 4, If 0, Class=vend., Driver=mceusb, 1.5M
        |__ Port 1: Dev 4, If 1, Class=HID, Driver=usbhid, 1.5M
        |__ Port 2: Dev 5, If 0, Class=stor., Driver=usb-storage, 480M
    |__ Port 8: Dev 3, If 0, Class=hub, Driver=hub/2p, 480M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=ehci_hcd/4p, 480M
    |__ Port 3: Dev 2, If 0, Class=hub, Driver=hub/7p, 480M
        |__ Port 1: Dev 3, If 0, Class=hub, Driver=hub/2p, 12M
            |__ Port 1: Dev 9, If 0, Class=HID, Driver=usbhid, 12M
            |__ Port 1: Dev 9, If 1, Class=HID, Driver=usbhid, 12M
            |__ Port 2: Dev 10, If 0, Class=audio, Driver=snd-usb-audio, 12M
            |__ Port 2: Dev 10, If 1, Class=audio, Driver=snd-usb-audio, 12M
            |__ Port 2: Dev 10, If 2, Class=audio, Driver=snd-usb-audio, 12M
        |__ Port 2: Dev 4, If 0, Class=HID, Driver=wacom, 12M
        |__ Port 3: Dev 5, If 0, Class=vend., Driver=gspca_zc3xx, 12M
        |__ Port 3: Dev 5, If 1, Class=audio, Driver=snd-usb-audio, 12M
        |__ Port 3: Dev 5, If 2, Class=audio, Driver=snd-usb-audio, 12M
        |__ Port 4: Dev 6, If 0, Class=vend., Driver=, 480M
        |__ Port 6: Dev 7, If 0, Class=HID, Driver=usbhid, 12M
        |__ Port 6: Dev 7, If 1, Class=HID, Driver=usbhid, 12M
        |__ Port 6: Dev 7, If 2, Class=HID, Driver=usbhid, 12M
        |__ Port 7: Dev 8, If 0, Class=hub, Driver=hub/4p, 12M
            |__ Port 2: Dev 11, If 0, Class=stor., Driver=usb-storage, 12M
            |__ Port 3: Dev 12, If 0, Class=vend., Driver=, 12M
            |__ Port 3: Dev 12, If 1, Class=vend., Driver=ftdi_sio, 12M
root@HP-m9000t:~# 

As can be seen above, no driver is bound to the scanner: it is Dev 6 on Bus 01 (Port 4), and its entry shows an empty "Driver=" field.



Then open the file /sys/kernel/debug/usb/devices in a text editor and search for "Canon" (or search by the vendor and product ID found above).


T:  Bus=01 Lev=02 Prnt=02 Port=03 Cnt=04 Dev#=  6 Spd=480  MxCh= 0
D:  Ver= 2.00 Cls=ff(vend.) Sub=ff Prot=ff MxPS=64 #Cfgs=  1
P:  Vendor=04a9 ProdID=221b Rev= 2.00
S:  Manufacturer=Canon
S:  Product=CanoScan
C:* #Ifs= 1 Cfg#= 1 Atr=c0 MxPwr= 10mA
I:* If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=83(I) Atr=03(Int.) MxPS=   1 Ivl=16ms

From the information above, we can see that the scanner uses three endpoints:

  • One OUT bulk endpoint
  • One IN bulk endpoint
  • One IN interrupt endpoint

From this we can guess that the PC uses the OUT bulk endpoint to send general control commands and normal data to the scanner, the IN bulk endpoint to receive scanned/preview images from the scanner, and the IN interrupt endpoint to receive other notifications (e.g., when any of the scanner's buttons is pressed).
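
To avoid reading the "E:" lines by eye, they can be parsed with a few lines of Python. This is only a minimal sketch written for Python 3: the regular expression is my own and was fitted to the three endpoint lines shown above, so it may need adjusting for other devices. The names parse_endpoints and DEVICES_SNIPPET are made up for this example.

```python
import re

# The "E:" lines copied from /sys/kernel/debug/usb/devices above
DEVICES_SNIPPET = """\
E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=83(I) Atr=03(Int.) MxPS=   1 Ivl=16ms
"""

# Assumption: this pattern only covers the layout seen in the lines above
ENDPOINT_RE = re.compile(
    r"^E:\s+Ad=([0-9a-fA-F]{2})\((\w)\)\s+Atr=([0-9a-fA-F]{2})\(([^)]*)\)"
    r"\s+MxPS=\s*(\d+)\s+Ivl=(\S+)"
)


def parse_endpoints(text):
    """Return one dict per endpoint line found in text."""
    endpoints = []
    for line in text.splitlines():
        m = ENDPOINT_RE.match(line)
        if not m:
            continue
        address = int(m.group(1), 16)
        endpoints.append({
            "address": address,
            "number": address & 0x0F,                    # low nibble = endpoint number
            "direction": "IN" if address & 0x80 else "OUT",  # bit 7 = direction
            "type": m.group(4).rstrip("."),              # "Bulk" or "Int"
            "max_packet": int(m.group(5)),
            "interval": m.group(6),
        })
    return endpoints


for ep in parse_endpoints(DEVICES_SNIPPET):
    print("EP%d %-3s %-4s MxPS=%d Ivl=%s" % (
        ep["number"], ep["direction"], ep["type"],
        ep["max_packet"], ep["interval"]))
```

The direction comes from bit 7 of the endpoint address, which is why 0x81 and 0x83 are IN endpoints and 0x02 is an OUT endpoint.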

How to use usbmon?

The captured data is exposed through /sys/kernel/debug/usb/usbmon/<file>, where <file> is "0u" to capture packets on all buses, or "<bus#>u" for a specific bus.  Where do we find the bus number? In the devices file shown above: the first line says "Bus=01", so to capture packets to and from the scanner we just need to cat the file "1u". (The detailed usbmon documentation can be read here: http://lxr.linux.no/linux+v2.6.28.8/Documentation/usb/usbmon.txt)
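
If you go the "cat 1u" route, the output still contains traffic from every device on the bus. Below is a rough Python 3 sketch that keeps only the scanner's lines. It assumes the "u" text format described in usbmon.txt, where the first four words are the URB tag, a timestamp in microseconds, the event type, and an address word such as "Ci:1:006:0" (transfer type and direction, bus, device, endpoint). The sample lines are made up for illustration, they are not a real capture, and the function names are my own.

```python
from collections import namedtuple

UsbmonEvent = namedtuple(
    "UsbmonEvent",
    "tag timestamp_us event xfer direction bus device endpoint")

# First letter of the address word: transfer type
XFER_TYPES = {"C": "control", "Z": "isochronous", "I": "interrupt", "B": "bulk"}


def parse_usbmon_line(line):
    """Parse the first four words of one line in the usbmon 'u' format."""
    tag, timestamp, event, address = line.split()[:4]
    type_dir, bus, device, endpoint = address.split(":")
    return UsbmonEvent(
        tag=tag,
        timestamp_us=int(timestamp),
        event=event,                       # S = submit, C = callback, E = error
        xfer=XFER_TYPES[type_dir[0]],
        direction="IN" if type_dir[1] == "i" else "OUT",
        bus=int(bus),
        device=int(device),
        endpoint=int(endpoint),
    )


def is_scanner(event, bus=1, device=6):
    """True for traffic of the scanner (bus 1, device 6 on my system)."""
    return event.bus == bus and event.device == device


# Made-up sample lines, shaped after the documented format
SAMPLE_LINES = [
    "ffff88013280d240 2030187 S Ci:1:006:0 s 80 06 0100 0000 0028 40 <",
    "ffff880100000000 2030500 S Ii:1:009:1 -115:8 4 <",
    "ffff88013280d240 2031036 C Ci:1:006:0 0 18 = 12010002 ffffff40 a9041b22 00020304 0001",
]

events = [parse_usbmon_line(line) for line in SAMPLE_LINES]
for ev in events:
    if is_scanner(ev):
        print(ev.event, ev.xfer, ev.direction, "ep", ev.endpoint)
```

To use it on live data, read the lines from /sys/kernel/debug/usb/usbmon/1u (as root) instead of SAMPLE_LINES.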

An easier way is to use Wireshark (run it as root, because it needs access to the USB devices).  Start a capture, let it run for some time, then stop it.  Apply the filter "usb.idVendor == 0x4a9 && usb.idProduct == 0x221b" or "usb.device_address == 6 && usb.bus_id == 1"; in my case, I know the scanner is at address 6 on bus 1.  Save the captured data to a file (select the "Wireshark/tcpdump" format).

To read it:

tshark -r pcapfile -T fields -V -e usb.capdata

or

tshark -r pcapfile -T fields -x

For example:

root@HP-m9000t:~# tshark -P -r ./usbscanner1.pcap -x -V -R "usb.bus_id==1 && usb.device_address==6"

tshark: Lua: Error during loading:
 [string "/usr/share/wireshark/init.lua"]:45: dofile has been disabled
Running as user "root" and group "root". This could be dangerous.
Frame 528: 64 bytes on wire (512 bits), 64 bytes captured (512 bits)
    WTAP_ENCAP: 115
    Arrival Time: Jan 17, 2013 22:29:25.703289000 PST
    [Time shift for this packet: 0.000000000 seconds]
    Epoch Time: 1358490565.703289000 seconds
    [Time delta from previous captured frame: 0.000025000 seconds]
    [Time delta from previous displayed frame: 0.000000000 seconds]
    [Time since reference or first frame: 2.030187000 seconds]
    Frame Number: 528
    Frame Length: 64 bytes (512 bits)
    Capture Length: 64 bytes (512 bits)
    [Frame is marked: False]
    [Frame is ignored: False]
    [Protocols in frame: usb]
USB URB
    URB id: 0xffff88013280d240
    URB type: URB_SUBMIT ('S')
    URB transfer type: URB_CONTROL (0x02)
    Endpoint: 0x80, Direction: IN
        1... .... = Direction: IN (1)
        .000 0000 = Endpoint value: 0
    Device: 6
    URB bus id: 1
    Device setup request: relevant (0)
    Data: not present ('<')
    URB sec: 1358490565
    URB usec: 703289
    URB status: Operation now in progress (-EINPROGRESS) (-115)
    URB length [bytes]: 40
    Data length [bytes]: 0
URB setup
    bmRequestType: 0x80
        1... .... = Direction: Device-to-host
        .00. .... = Type: Standard (0x00)
        ...0 0000 = Recipient: Device (0x00)
    bRequest: GET DESCRIPTOR (6)
    Descriptor Index: 0x00
    bDescriptorType: DEVICE (1)
    Language Id: no language specified (0x0000)
    wLength: 40

0000  40 d2 80 32 01 88 ff ff 53 02 80 06 01 00 00 3c   @..2....S......<
0010  c5 eb f8 50 00 00 00 00 39 bb 0a 00 8d ff ff ff   ...P....9.......
0020  28 00 00 00 00 00 00 00 80 06 00 01 00 00 28 00   (.............(.
0030  00 00 00 00 00 00 00 00 00 02 00 00 00 00 00 00   ................

Frame 529: 82 bytes on wire (656 bits), 82 bytes captured (656 bits)
    WTAP_ENCAP: 115
    Arrival Time: Jan 17, 2013 22:29:25.704138000 PST
    [Time shift for this packet: 0.000000000 seconds]
    Epoch Time: 1358490565.704138000 seconds
    [Time delta from previous captured frame: 0.000849000 seconds]
    [Time delta from previous displayed frame: 0.000849000 seconds]
    [Time since reference or first frame: 2.031036000 seconds]
    Frame Number: 529
    Frame Length: 82 bytes (656 bits)
    Capture Length: 82 bytes (656 bits)
    [Frame is marked: False]
    [Frame is ignored: False]
    [Protocols in frame: usb]
USB URB
    URB id: 0xffff88013280d240
    URB type: URB_COMPLETE ('C')
    URB transfer type: URB_CONTROL (0x02)
    Endpoint: 0x80, Direction: IN
        1... .... = Direction: IN (1)
        .000 0000 = Endpoint value: 0
    Device: 6
    URB bus id: 1
    Device setup request: not relevant ('-')
    Data: present (0)
    URB sec: 1358490565
    URB usec: 704138
    URB status: Success (0)
    URB length [bytes]: 18
    Data length [bytes]: 18
    [Request in: 528]
    [Time from request: 0.000849000 seconds]
    [bInterfaceClass: Unknown (0xffff)]
DEVICE DESCRIPTOR
    bLength: 18
    bDescriptorType: DEVICE (1)
    bcdUSB: 0x0200
    bDeviceClass: VENDOR_SPECIFIC (0xff)
    bDeviceSubClass: 255
    bDeviceProtocol: 255
    bMaxPacketSize0: 64
    idVendor: 0x04a9
    idProduct: 0x221b
    bcdDevice: 0x0200
    iManufacturer: 3
    iProduct: 4
    iSerialNumber: 0
    bNumConfigurations: 1

0000  40 d2 80 32 01 88 ff ff 43 02 80 06 01 00 2d 00   @..2....C.....-.
0010  c5 eb f8 50 00 00 00 00 8a be 0a 00 00 00 00 00   ...P............
0020  12 00 00 00 12 00 00 00 00 00 00 00 00 00 00 00   ................
0030  00 00 00 00 00 00 00 00 00 02 00 00 00 00 00 00   ................
0040  12 01 00 02 ff ff ff 40 a9 04 1b 22 00 02 03 04   .......@..."....
0050  00 01                            
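
The last 18 bytes of frame 529 (offset 0x40 onward) are the device descriptor itself, and they can be decoded by hand to cross-check what tshark printed. A small Python 3 sketch, assuming the standard 18-byte USB device descriptor layout (little-endian); the variable names are my own:

```python
import struct

# Bytes 0x40..0x51 of frame 529, copied from the hex dump above
raw = bytes.fromhex("12 01 00 02 ff ff ff 40 a9 04 1b 22 00 02 03 04 00 01")

FIELDS = (
    "bLength", "bDescriptorType", "bcdUSB", "bDeviceClass",
    "bDeviceSubClass", "bDeviceProtocol", "bMaxPacketSize0",
    "idVendor", "idProduct", "bcdDevice",
    "iManufacturer", "iProduct", "iSerialNumber", "bNumConfigurations",
)

# "<" = little-endian, B = 1 byte, H = 2 bytes; 18 bytes in total
descriptor = dict(zip(FIELDS, struct.unpack("<BBHBBBBHHHBBBB", raw)))

for name in FIELDS:
    print("%-20s 0x%04x" % (name, descriptor[name]))
```

The decoded idVendor and idProduct come out as 0x04a9 and 0x221b, matching both the lsusb output and the tshark dissection.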

usb.capdata is just one of the available fields; many others can be displayed.  See the Wireshark documentation at http://www.wireshark.org/docs/dfref/u/usb.html for more detail.

Because the data collected above gives no real clue yet (all it shows is the host querying the scanner and the scanner responding), I will try again on Windows, where the original driver is installed.
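
For completeness, the host's query in frame 528 can be decoded the same way. The 8 bytes at offset 0x28 of that frame are the control setup packet. The sketch below (Python 3) assumes the standard USB setup packet layout of bmRequestType, bRequest, wValue, wIndex and wLength, all little-endian; the variable names are mine.

```python
import struct

# Bytes 0x28..0x2f of frame 528, copied from the hex dump above
setup = bytes.fromhex("80 06 00 01 00 00 28 00")

bmRequestType, bRequest, wValue, wIndex, wLength = struct.unpack("<BBHHH", setup)

direction = "device-to-host" if bmRequestType & 0x80 else "host-to-device"
request_type = (bmRequestType >> 5) & 0x03   # 0 = standard request
recipient = bmRequestType & 0x1F             # 0 = device

# For GET_DESCRIPTOR (bRequest 6), wValue holds the descriptor type in the
# high byte and the descriptor index in the low byte
descriptor_type = wValue >> 8
descriptor_index = wValue & 0xFF

print("direction       :", direction)
print("bRequest        :", bRequest)
print("descriptor type :", descriptor_type)
print("descriptor index:", descriptor_index)
print("wLength         :", wLength)
```

So the host asks for up to 40 bytes of the device descriptor (type 1, index 0), and the scanner answers with the 18 bytes seen in frame 529.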

(to be continued...)



Sunday, January 13, 2013

FriendlyARM

My new Samsung S5PV210-based ARM kit is now up and running Android 4.0.3 (ICS).



Monday, January 7, 2013

Mini210S-SDK43 vs. Beagleboard Rev B2


Beagleboard:


root@beagleboard:~# cat /proc/cpuinfo
Processor       : ARMv7 Processor rev 3 (v7l)
BogoMIPS        : 568.23
Features        : swp half thumb fastmult vfp edsp thumbee neon vfpv3 
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x1
CPU part        : 0xc08
CPU revision    : 3

Hardware        : OMAP3 Beagle Board
Revision        : 0020
Serial          : 0000000000000000
root@beagleboard:~# cat /proc/mtd
dev:    size   erasesize  name
mtd0: 00080000 00020000 "X-Loader"
mtd1: 001e0000 00020000 "U-Boot"
mtd2: 00020000 00020000 "U-Boot Env"
mtd3: 00400000 00020000 "Kernel"
mtd4: 0f980000 00020000 "File System"
root@beagleboard:~# free -m
             total       used       free     shared    buffers     cached
Mem:           106         96          9          0          3         33
-/+ buffers/cache:         59         47
Swap:            0          0          0
root@beagleboard:~# 


Mini210S:

/system/busybox/bin # uname -a
Linux FriendlyARM 3.0.8-FriendlyARM #1 PREEMPT Sat Oct 27 15:57:19 CST 2012 armv7l GNU/Linux
/system/busybox/bin # cat /proc/cpuinfo
Processor       : ARMv7 Processor rev 2 (v7l)
BogoMIPS        : 994.84
Features        : swp half thumb fastmult vfp edsp thumbee neon vfpv3 
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x2
CPU part        : 0xc08
CPU revision    : 2

Hardware        : MINI210
Revision        : 0000
Serial          : 0000000000000000
/system/busybox/bin # 

/system/busybox/bin # cat /proc/mtd
dev:    size   erasesize  name
mtd0: 00040000 00100000 "misc"
mtd1: 00500000 00100000 "recovery"
mtd2: 00500000 00100000 "kernel"
mtd3: 00300000 00100000 "ramdisk"
mtd4: 7f200000 00100000 "system"
/system/busybox/bin #

Wednesday, November 21, 2012

Lazy Fox fun set



>>> magicc = set('the quick brown fox jumps over a lazy dog')
>>> sorted(magicc)
[' ', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
>>>
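
The fun part is that the sorted set contains every letter from 'a' to 'z', i.e. the sentence is a pangram. The same check can be written explicitly; this is just a small Python 3 sketch, and the function names are made up:

```python
import string

SENTENCE = "the quick brown fox jumps over a lazy dog"


def missing_letters(text):
    """Return the lowercase ASCII letters that do not occur in text."""
    return sorted(set(string.ascii_lowercase) - set(text.lower()))


def is_pangram(text):
    """True if text uses every letter of the alphabet at least once."""
    return not missing_letters(text)


print(is_pangram(SENTENCE))
print(missing_letters("the quick brown fox"))
```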



Friday, October 26, 2012

Testing Unicode on this blog

I recently made a small change to this blog's template: it now declares UTF-8 encoding.  See any changes?

If you right-click and select "view source of this page", you should see the HTML tag "<meta content='text/html; charset=UTF-8' http-equiv='Content-Type'/>", which means the page should now support UTF-8 encoded text.

Sunday, October 21, 2012

Get Geodata (latitude, longitude, etc.) in Python


I left the API keys empty because I don't want to expose mine. You need to register at ipinfodb.com and with Google to get your own API keys!


#!/usr/bin/python

import os
import sys
import socket
import httplib
import xml.dom.minidom as minidom
from googlemaps import GoogleMaps


""" Get googlemap API module from: http://pypi.python.org/pypi/googlemaps/1.0.2

    To get Googlemap API Key:
    https://code.google.com/apis/console/?pli=1#project:255524978890:access

"""

lat=38.4193
lon=-122.6978
myip = "99.9.21x.xx"  # hidden for the purpose of this blog
debug=0


#time

lat2timeURL="www.earthtools.org"
#lat2timeFile="/timezone/" + str(lat) + '/' + str(lon)

lat2heightURL="www.earthtools.org"
#lat2heightFile="/height/" + str(lat) + "/" + str(lon)

infodb_apikey = "get your own key"
gmap_apikey = "get your own key"

ip2lat2URL = "api.ipinfodb.com"
ip2lat2File = "/v3/ip-city/?key=" + infodb_apikey + "&format=xml&ip="



def getText(nodeList):
    rc = []
    for node in nodeList:
        if node.nodeType == node.TEXT_NODE:
            rc.append(node.data)
    return ''.join(rc)

def getXmlElementFromObj(obj,  name):
    # Return the text of the first <name> element, or "" if there is none
    objList = obj.getElementsByTagName(name)
    if objList:
        return getText(objList[0].childNodes)
    else:
        return ""


def DisplayUsage():
    sys.stderr.write("\nUsage: %s <address in double-quotes>\n\n" % sys.argv[0])


def EarthData(address):
    try:
        gmaps = GoogleMaps(gmap_apikey)
        lat, lon = gmaps.address_to_latlng(address)
    except:
        sys.stderr.write("\nUnable to query GoogleMaps\n")
        sys.exit(1)

    lat2timeFile="/timezone/" + str(lat) + '/' + str(lon)
    lat2heightFile="/height/" + str(lat) + "/" + str(lon)

    # Defaults, so the return at the end works even if a query fails
    localtime = meter = feet = ""

    conn = httplib.HTTPConnection(lat2timeURL)
    conn.request("GET", lat2timeFile)
    resp = conn.getresponse()
    if resp.status == 200:
        data = resp.read()
        if debug:
            print data
        xml = minidom.parseString(data)
        timezoneObj = xml.getElementsByTagName("timezone")
        for tmObj in timezoneObj:
            nodes = tmObj.childNodes
            version = getXmlElementFromObj(tmObj, "version")
            lat = getXmlElementFromObj(tmObj, "latitude")
            lon = getXmlElementFromObj(tmObj, "longitude")
            localtime = getXmlElementFromObj(tmObj, "localtime")
            isotime = getXmlElementFromObj(tmObj, "isotime")
            utctime = getXmlElementFromObj(tmObj, "utctime")
            #print "version=%s" % version
            if debug:
                print "latitude  : %s" % lat
                print "longitude : %s" % lon
                print "localtime : %s" % localtime
    conn.close()

    conn = httplib.HTTPConnection(lat2heightURL)
    conn.request("GET", lat2heightFile)
    resp = conn.getresponse()
    if resp.status == 200:
        data = resp.read()
        if debug:
            print data
        xml = minidom.parseString(data)
        hObj = xml.getElementsByTagName("height")
        for h in hObj:
            meter = getText(h.getElementsByTagName("meters")[0].childNodes)
            feet = getText(h.getElementsByTagName("feet")[0].childNodes)
            if debug:
                print "Sea-level : %s meters = %s feet" % (meter, feet)

    conn.close()
    return (lat, lon, localtime, meter, feet)


def GetPublicIp(name):
    myip = str(socket.gethostbyname(name))

    if debug:
        print "IP Address: %s" % myip
    # Build the request path in a local variable; the module-level
    # ip2lat2File stays unchanged
    requestFile = ip2lat2File + myip
    conn = httplib.HTTPConnection(ip2lat2URL)
    conn.request("GET", requestFile)
    resp = conn.getresponse()
    if resp.status == 200:
        data = resp.read()
        xml = minidom.parseString(data)
        #print data
        locObj = xml.getElementsByTagName("Response")
        for loc in locObj:
            nodes = loc.childNodes
            status = getXmlElementFromObj(loc,"statusCode")
            if status == "OK":
                lat = getXmlElementFromObj(loc, "latitude")
                lon = getXmlElementFromObj(loc, "longitude")
                countryCode = getXmlElementFromObj(loc, "countryCode")
                countryName = getXmlElementFromObj(loc, "countryName")
                regionName = getXmlElementFromObj(loc, "regionName")
                cityName = getXmlElementFromObj(loc, "cityName")
                zipCode = getXmlElementFromObj(loc, "zipCode")
                timezone = getXmlElementFromObj(loc, "timeZone")
                print "Address   : %s %s, %s %s" % (cityName, str(zipCode), regionName, countryName )
    conn.close()

# ============================== MAIN ==================================
if __name__ == "__main__":
    if len(sys.argv) < 2:
        DisplayUsage()
        sys.exit(1)
    else:
        addr = sys.argv[1]
        print "Querying %s" % addr
        (lat, lon, ltime, meter, feet) = EarthData(addr)

    print "========================="
    print "Address   : %s" % addr
    print "Latitude  : %s" % lat
    print "Longitude : %s" % lon
    print "Local-time: %s" % ltime
    print "Sea-level : %s meters (%s feet)" % (meter,  feet)

#    print "ip2Lat2=%s%s" % (ip2lat2URL, ip2lat2File)


Example (assume the script above is saved as "earthdata.py"):


bash-4.2$ ./earthdata.py "1450 N McDowell Blvd, Petaluma, CA 94954"
Querying 1450 N McDowell Blvd, Petaluma, CA 94954
=========================
Address   : 1450 N McDowell Blvd, Petaluma, CA 94954
Latitude  : 38.279534
Longitude : -122.6709223
Local-time: 21 Oct 2012 16:47:33
Sea-level : 12 meters (39.4 feet)
bash-4.2$
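
The XML handling can be tried without any API key or network access. The sketch below is Python 3 and reuses the same minidom approach as the script. The sample document is made up: it only borrows the element names the script looks up (timezone, latitude, longitude, localtime) and the values from the example run above, and it is not a recorded response from any service.

```python
import xml.dom.minidom as minidom

# Made-up sample, shaped after the element names used in the script
SAMPLE_XML = """<?xml version="1.0"?>
<timezone>
  <latitude>38.279534</latitude>
  <longitude>-122.6709223</longitude>
  <localtime>21 Oct 2012 16:47:33</localtime>
</timezone>
"""


def get_text(node_list):
    """Concatenate the text nodes found directly in node_list."""
    return "".join(node.data for node in node_list
                   if node.nodeType == node.TEXT_NODE)


def get_element_text(obj, name):
    """Text of the first <name> element below obj, or "" if it is absent."""
    elements = obj.getElementsByTagName(name)
    return get_text(elements[0].childNodes) if elements else ""


doc = minidom.parseString(SAMPLE_XML)
for tz in doc.getElementsByTagName("timezone"):
    print("latitude  : %s" % get_element_text(tz, "latitude"))
    print("longitude : %s" % get_element_text(tz, "longitude"))
    print("localtime : %s" % get_element_text(tz, "localtime"))
    print("isotime   : %r" % get_element_text(tz, "isotime"))
```

A missing element (isotime in this sample) simply yields an empty string instead of raising an exception.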