My First Lambda Container
Container support for Lambda was announced at re:Invent 2020. At first, I didn’t think much of it. My projects are generally small enough to work fine with a little npm i. But I have a little side project that could benefit from it: running SpamAssassin inside Lambda to tackle SES’s spam tagging failure.
Despite not being very familiar with Docker, I rolled up my sleeves and started on the project. This put me through the typical four stages of an AWS re:Invent announcement:
- 😍 Excitement
- 😞 Disappointment
- 😡 Anger
- 😑 Resignation
Stage 1: Excitement
Being able to use Lambda in my SES flow without calling a third service, what a relief. Building a container for Lambda shouldn’t be that hard…
I just needed a Node.js spamc implementation in my function, pass the body of the email to Lambda, et voilà.
I fired up the container locally, threw some messages at it, and there it was: all the messages that SES let through were pushing SpamAssassin’s score into the dark red. I finally had a solution that was cheap (aka pay only when needed) and without any change in my current flow, except adding a
At that point, I had big plans for a v2 with EFS-backed storage to share the Bayes data and support manual learning from misclassified messages. One can dream big…
Stage 2: Disappointment
That’s when I fell from excitement into disappointment. Everything that worked so perfectly locally just failed in a big ball of fire. My container in Lambda had the same constraints as any function in Lambda: it runs on a read-only filesystem. I was now unable to run sa-update, unable to bind spamd to 127.0.0.1:783. No pid files could be written, no per-service log files, …
Stage 3: Anger
I naively thought that I could do what I wanted with my container. No Lambda runtime would get in my way. I was first very angry at AWS for promising me something that didn’t meet my wishes/hopes (I almost always end up in this state with a new service).
Then I got angry at myself for not having foreseen the obvious: it is logical that my container runs read-only. It’s still Lambda.
I started to move everything to /tmp, which kind of worked after several rounds of trial and error. I don’t know how to emulate this in my local Docker, so I had to test each little tweak online. I was still hitting one problem: I somehow never got spamd to run on a Unix file socket instead of a TCP socket. I wanted to keep spamd, since it gave me a nice output with just the score and the rules.
Stage 4: Resignation
After a few hours, I caved in and let go of my dreams of a fleet of spamd with shared config and rules.
I decided to just run spamassassin and parse the headers of the modified message.
sa-update runs during image build, and logs are sent to stderr. I disabled DCC, hoping that the standard RBL rules, in addition to Pyzor and Razor, would be a good enough start. It can’t be worse than what SES is doing.
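Here is a minimal sketch of what that header parsing could look like. It assumes SpamAssassin's default X-Spam-Status header layout ("Yes, score=… required=… tests=… autolearn=…"), possibly folded over several lines. The names parseSpamStatus and SpamVerdict are made up for this illustration; this is not the exact code running in my function.

```typescript
// Verdict extracted from the X-Spam-Status header (hypothetical shape).
interface SpamVerdict {
  isSpam: boolean;
  score: number;
  required: number;
  rules: string[];
}

// Hypothetical helper: reads the X-Spam-Status header from a message that
// spamassassin has already rewritten. Returns null when the header is
// missing or does not follow the assumed default layout.
function parseSpamStatus(message: string): SpamVerdict | null {
  // Headers end at the first blank line.
  const end = message.search(/\r?\n\r?\n/);
  const headerBlock = end === -1 ? message : message.slice(0, end);

  // Unfold continuation lines (they start with a space or a tab).
  const unfolded = headerBlock.replace(/\r?\n[ \t]+/g, " ");

  const line = unfolded
    .split(/\r?\n/)
    .find((l) => l.toLowerCase().startsWith("x-spam-status:"));
  if (line === undefined) return null;

  const value = line.slice(line.indexOf(":") + 1).trim();
  const score = /\bscore=(-?\d+(?:\.\d+)?)/.exec(value);
  const required = /\brequired=(-?\d+(?:\.\d+)?)/.exec(value);
  if (score === null || required === null) return null;

  // The rule list runs from "tests=" up to the next "key=" pair.
  const tests = /\btests=(.*?)(?:\s+\w+=|$)/.exec(value);
  const rules = (tests?.[1] ?? "")
    .split(/[,\s]+/)
    .filter((r) => r !== "" && r !== "none");

  return {
    isSpam: /^yes\b/i.test(value),
    score: Number(score[1]),
    required: Number(required[1]),
    rules,
  };
}
```

The handler can then drop the message when isSpam is true, or compare score against its own threshold instead of the one configured in SpamAssassin.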
Does it work?
I have had this in production for less than two days at the time of writing, which is not enough data to make a definitive statement, but so far it behaves very nicely. All the spam is discarded, with no false positives so far. 🤞
The function runs for between 10 and 12 seconds and uses between 180 and 220 MB (~0.01¢ per invocation).
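For a rough idea of where a figure like that comes from, here is a back-of-the-envelope sketch. Lambda bills on configured memory, not on the memory actually used, and the configured size is not stated above, so the 512 MB below is purely an assumption. The price per GB-second is also an assumption (an on-demand x86 rate); check the current pricing page for your region. The per-request fee is ignored.

```typescript
// Assumed on-demand price per GB-second in USD. Not taken from this post;
// verify it against the Lambda pricing page for your region and architecture.
const ASSUMED_PRICE_PER_GB_SECOND = 0.0000166667;

// Hypothetical helper: estimated compute cost of one invocation, in US cents.
// memoryMb is the CONFIGURED memory size, which is what Lambda bills on.
function estimateInvocationCostCents(
  memoryMb: number,
  durationSeconds: number,
  pricePerGbSecond: number = ASSUMED_PRICE_PER_GB_SECOND,
): number {
  const gbSeconds = (memoryMb / 1024) * durationSeconds;
  return gbSeconds * pricePerGbSecond * 100;
}

// Assumption: 512 MB configured, worst-case 12 second run.
const worstCaseCents = estimateInvocationCostCents(512, 12);
console.log(worstCaseCents.toFixed(4)); // roughly one hundredth of a cent
```

With a smaller configured size the cost drops proportionally, as long as the run time stays the same.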
What is still missing
Besides the read-only limitation, which will probably never be lifted, you currently can’t use an image from another account’s ECR. You also can’t use an image published to public ECR. I guess these are just limitations of a brand new service and should be lifted in the near future. For now, I just need to push my image to the 2 accounts where it is needed. Some CodePipeline would come in handy.
What’s next?
I could use EventBridge Scheduler to rebuild the image weekly to keep the rules up to date, which is a better solution anyway than pulling them at every start.
Otherwise, I had thought about running this on Fargate before, but I didn’t push hard enough on scaling down to zero when there is nothing to process. AWS Batch could be another contender, now that it runs on Fargate.
Lambda containers won’t become my main way of deploying Lambda. I just find the hassle of building, deploying, and testing a container more troublesome than an sls invoke local and
But I can see why this will appeal to some people, especially when you need compiled tools that you can’t compile on your development machine (sharp comes to mind), or if you can’t just go get in your dependencies.