Diagnosing and Fixing a Failing HTTPS API Request

Introduction: This is a post to make a note of my recent journey of diagnosing and fixing a failing HTTPS API request.
Background: I have a user having issues with the Android app. It appears that the app has difficulty communicating with the server (API gateway). A generic connection error is shown to the user upon API call.
Observation 0: It cannot be reproduced locally. None of my devices (phones and browsers) has errors communicating with the server.

Writing a Serverless Cron Job

Benefits of a serverless cron job: What’s great about going serverless with a cron job is that there is no need to set up the environment or configure crontab, logging, error detection, etc. And you don’t have to worry about whether the VM is down. It is also easier on your wallet. Here I am using AWS Lambda for my serverless cron job.
Configuration and tips: An AWS Lambda function can be triggered periodically using CloudWatch Events.
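To make the idea concrete, here is a minimal sketch of what a Python handler for such a scheduled function might look like. The names `handler` and `run_job` are hypothetical placeholders, not taken from the post; the only thing assumed about the incoming event is the `time` field that scheduled events carry.

```python
import json
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def run_job():
    """Hypothetical placeholder for the actual periodic work."""
    return {"processed": 0}


def handler(event, context):
    """Entry point configured as the Lambda handler (name is an assumption)."""
    # Scheduled events include the trigger time in the "time" field.
    logger.info("Triggered at %s", event.get("time"))
    result = run_job()
    # Log the outcome so each run leaves a trace in the function's logs.
    logger.info("Job finished: %s", json.dumps(result))
    return result
```

The handler can be exercised locally by calling it with a fake event, for example `handler({"time": "2019-01-01T00:00:00Z"}, None)`.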

Debugging Python Slow json.loads

Background: Profiling shows that pymongo’s bson.json_util.loads is consuming an unusual amount of CPU time.
Benchmark: To confirm that the function is slow, let’s do some benchmarking.

    import pyperf

    def data():
        import json
        return json.dumps(['asdfasdf%s' % i for i in xrange(20)])

    s = data()
    runner = pyperf.Runner()
    runner.timeit(name="json", stmt="json.loads(s)", setup="from __main__ import s; import json;")
    runner.timeit(name="simplejson", stmt="simplejson.loads(s)", setup="from __main__ import s; import simplejson;")
    runner.timeit(name="bson json_util", stmt="json_util.loads(s)", setup="from __main__ import s; from bson import json_util;")

Result:

Python mysqlclient Doesn't Work Well with gevent

mysqlclient (MySQLdb fork) is a Python MySQL driver which uses libmysqlclient (or libmariadbclient). This means speed and stability. Does it work well with gevent? Not really. In mysqlclient <= v1.3.13, there’s a sample called waiter_gevent.py which looks like this:

    from __future__ import print_function
    """Demo using Gevent with mysqlclient."""

    import gevent.hub
    import MySQLdb


    def gevent_waiter(fd, hub=gevent.hub.get_hub()):
        hub.wait(hub.loop.io(fd, 1))


    def f(n):
        conn = MySQLdb.connect(user='root', waiter=gevent_waiter)
        cur = conn.cursor()
        cur.execute("SELECT SLEEP(%s)", (n,))
        cur.execute("SELECT 1+%s", (n,))
        print(cur.

Fixing boto (AWS Python interface) Quadratic Runtime Connection Pool

Background: I came across some Python code using boto, the Python interface to AWS. There’s boto3, which is a newer version of boto. If you are starting a new project, you should be looking at boto3 instead of boto. I am told that boto pools connections. This is a good thing in terms of performance. Never waste HTTPS connections, because they are expensive to set up. But does it limit the number of outgoing connections like SQLAlchemy’s QueuePool, or is the number of pooled connections configurable?

Debugging Memory Usage in Python 2.7 with tracemalloc

Update 16 Mar 2019: Updated instructions.
Update 16 Oct 2018: There was a bug in pytracemalloc that prevented the PYTHONTRACEMALLOC environment variable from working. I have submitted a pull request and it is now merged. The PR fixed the bug in the Python patches, added a testing script for the patches and improved the documentation.
Background: My application is killed by the OOM killer. How do I find out why the application is taking up so much memory?
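As a rough illustration of the kind of question tracemalloc answers, the sketch below starts tracing, runs a stand-in workload and prints the source lines that allocated the most memory. It is written against the standard-library `tracemalloc` module in Python 3; the `leaky` list is a hypothetical stand-in for real application code, and whether the exact same calls work on a patched Python 2.7 is an assumption.

```python
import tracemalloc

# Start tracing, keeping up to 25 frames per allocation traceback.
tracemalloc.start(25)

# Hypothetical workload standing in for the real application code.
leaky = [bytearray(1024) for _ in range(1000)]

# Capture the current allocations and group them by source line.
snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics("lineno")

# Print the five source lines responsible for the most allocated memory.
for stat in top_stats[:5]:
    print(stat)
```

Taking two snapshots at different times and comparing them with `snapshot.compare_to()` is usually more useful for finding growth than a single snapshot.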

Python, SQLAlchemy, Pytest, PyCharm Debugger & NoSuchColumnError

I get this error when I use the PyCharm debugger with pytest: NoSuchColumnError: "Could not locate column in row for column 'mytable.id'" This is a SQLAlchemy error. There is no useful help on Google. There is no error if the breakpoints are muted. Therefore, it is possibly related to lazy loading of relationships when displaying variables. This can be resolved by turning off the “load values asynchronously” option in the PyCharm debugger. As far as I can remember, this is a new feature in PyCharm 2017.

Python & MySQL Interrupted System Call & pyinstrument

Background: My Python application cannot connect to MySQL. The error message looks like this: Can’t connect to MySQL server on ‘127.0.0.1’ (4) Error 4 means Interrupted system call. I am using mysqlclient, the C wrapper MySQL connector. The error happens on both MySQL 5.6 and 5.7. It can be reproduced consistently. It seems that PyMySQL doesn’t have this problem. I am also using gevent, but it is not really relevant in this case.
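The number in parentheses is an operating system errno value, so it can be looked up from Python directly. A small sketch using only the standard library (the variable name `code` is just for illustration):

```python
import errno
import os

code = 4  # the number shown in parentheses in the MySQL error message

# Symbolic errno name for this code on the current platform.
print(errno.errorcode[code])

# Human-readable description as reported by the C library.
print(os.strerror(code))
```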

Writing a Python Wrapper for html2text using cffi

Background: I’ve fixed the html2text performance issue in the last post, so now I can use it. I need to use it from Python, and that does not leave me many choices. Python by the C side, a blog post in the PayPal Engineering blog, has listed the options. A C extension is hard to write and is not worth it. This post is about my experience of, and reflections on, using cffi for the first time.

Fixing html2text Quadratic Runtime

Background: I came across a command line utility available on Linux called html2text, which was first written in 1999 and changed hands later. Obviously an old project, but a solid one. At least it handles a<div><br></div>b properly by outputting a\n\nb, instead of a\n\n\nb like most of the other converters out there. (I’m looking at you, Python html2text.) I downloaded the source code of v1.3.2a from here and played around with it.

gevent built-in function getaddrinfo failed with error

Problem: Under heavy load using gevent, I see this:

    Traceback (most recent call last):
      File "/usr/local/lib/python2.7/dist-packages/gevent/threadpool.py", line 207, in _worker
        value = func(*args, **kwargs)
    error: [Errno 11] Resource temporarily unavailable
    (<ThreadPool at 0x7fe468930dd0 0/5/10>, <built-in function getaddrinfo>) failed with error

Solution: There’s no good solution out there. Actually, it is easier to solve than expected. You only have to change gevent’s DNS resolver. In the doc, they didn’t clearly state the difference between the resolvers.
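For reference, one way to select a different resolver is the `GEVENT_RESOLVER` environment variable, which has to be set before gevent is imported. The sketch below is only an illustration: the value `"ares"` is an assumption picked as an example of a non-default resolver, not necessarily the one this post ends up recommending.

```python
import os

# Assumption: "ares" is used here purely as an example value.
# It must be set before gevent is first imported to take effect.
os.environ["GEVENT_RESOLVER"] = "ares"

# import gevent  # import gevent only after the variable is set
```

Setting the variable in the process environment (for example in the service's startup script) achieves the same thing without touching the code.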

gevent.pywsgi's File Descriptor Leak and HTTP Keep-Alive

Background: My server uses gevent.pywsgi. It works fine. However, every few days the server stops responding to requests. It says “Too many open files” in the logs.
Investigation: A simple lsof shows that there are many socket connections opened by pywsgi even when those sessions are completed. This FD (File Descriptor) leak probably causes the process to reach the per-process open file limit (ulimit -n).
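The same check can be done from inside the process. The following is a minimal, Linux-only sketch that compares the number of open descriptors with the soft limit; the helper name `open_fd_count` is hypothetical, and it relies on `/proc/self/fd` being available.

```python
import os
import resource


def open_fd_count():
    """Count this process's open file descriptors (Linux-specific)."""
    # Each entry in /proc/self/fd corresponds to one open descriptor.
    return len(os.listdir("/proc/self/fd"))


# The soft limit is the value reported by `ulimit -n`.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("open fds: %d, soft limit: %d" % (open_fd_count(), soft))
```

Logging this number periodically makes a leak visible as a steadily growing count long before the limit is hit.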

Revisiting Qt (Again): Compiling Qt in Windows

The last time I compiled Qt, I really thought that it would be the last time. No, it isn’t. A user reported that Penguin Subtitle Player cannot be used under a 32-bit OS. Of course! I compiled it in a 64-bit environment. Time to compile Qt (statically) again in Windows!
Steps:
1. Install the required tools: Python 2, ActivePerl, Visual C++ Build Tools 2015
2. Get the Qt source and decompress it
Pro tip 1: Don’t use the Windows built-in decompression utility because it is horribly slow.

Amazon Alexa Skills Challenge and My Pocket

Update 21 Aug 2017: The skill is now open-source! Visit it on GitHub: github.com/carsonip/alexa-pocket
Intro: I don’t have much time for this blog post because I’ll be taking an early flight tomorrow. Maybe I’ll add something later.
Hackathon: To state the obvious, the hackathon is about building Alexa skills. I have submitted 2 entries: My Pocket and Tomato Helper. I’ll focus on My Pocket in this blog post.
My Pocket: The Alexa skill is about reading your saved Pocket articles to you using Alexa.

Revisiting Penguin Subtitle Player and Qt

Update: There’s a new post about building Qt statically in Windows.
It has been a while since I last updated Penguin Subtitle Player. Anyway, after a few days of work, and more than a year of waiting since the last beta, here comes the first production release of Penguin Subtitle Player. Apart from developing multi-subtitle-format parser support for maximum flexibility and maintainability and fixing a few GitHub issues, most importantly, I have tidied up the project and code to meet the standard of a good open-source project.

Debugging rsync and ssh

Background: I have fallen in love with rsync lately. It is particularly useful when I sync my Hadoop stuff (scripts and input, which add up to a few GBs) between my local machine and my Hadoop cluster. After running the sync script a few times, I could no longer ssh into the machine. This post is about how I debugged it and the lessons learned. This is what the problematic script sync.sh looks like: (the IP is masked for obvious reasons) (warning: this script is faulty, do not use)

Slash Hack 2016

Hackathon: It is the second and last hackathon of my SF trip. The /hack hackathon organized by HackerEarth gives me an enjoyable weekend. Participants are supposed to team up before the day. I didn’t manage to find a team beforehand and ended up meeting 2 wonderful teammates at the venue. One Singaporean, one Japanese. They want to work on Amazon Echo and luckily I brought one. Match made in heaven.

GitHub Open Source Hack 2016

Hackathon: The weekend is enjoyable. Martin (my roommate in SF) and I join the Open Source Hack organized by GitHub at GitHub HQ in SF. Pretty fun. The venue is great. The pool table is my favorite. Great food all the time. However, the atmosphere is more “just for fun” or educational than competitive, which is kind of different from what I expected.
Idea: The guy next to us has an Amazon Echo.