Erlang as a binary sed #howto
Thu 17 Dec 03:31:19 GMT 2020
I got my hands on a cute little CTF artefact, a file named AnExtraBitForEveryByte.
Its hexdump didn't look too promising (to me, since I'm not too experienced in CTFs):
00000000 19 b6 61 ce e9 06 46 2c d8 39 cd 87 2a d6 19 98 |..a...F,.9..*...|
00000010 b3 5a c5 8e 5c 4b 70 e4 c6 c5 6b 06 4d 19 61 c6 |.Z..\Kp...k.M.a.|
00000020 c5 93 06 0c 19 0a |......|
00000026
My first hypothesis was that to get ASCII from it, I should drop either the last bit or the first bit of every byte.
I lost a metric ton of time trying to use hex editors to do anything meaningful, and my verdict is the following:
hex editors are just interactive hex viewers.
Furthermore, many of them were crashing during usage, causing even more rage. Sadly I didn't document the worst offenders, but I remember bless hex editor outright not launching, wxhex crashing, Gnome being just bad.
Luckily, that wasn't my first rodeo when it comes to working with binary data. I implemented a full Bitcoin blockchain parser from scratch in 2013 and I did it... (you can guess it if you know me at all)... using BEAM languages.
So I figured I'd spin up the trusty Erlang virtual machine and let it do the hex editing.
Anatomy of Erlang bitstrings
Erlang bitstrings are very flexible, but 99.9999999% of real-world use cases will be covered by
- X:BitSize syntax.
As an expression (RHS) it means “take X and span it over BitSize bits”; as a pattern (LHS) it means “match BitSize bits against X” if X is a value, or “bind BitSize bits to X (as an integer)” if X is an unbound variable.
- Bys/binary syntax.
As an expression it splices the bytes of Bys into the binary being built; as a pattern it is the equivalent of binding the tail of a list.
- Bis/bitstring syntax.
This is a somewhat overlooked syntax that does the same thing as /binary, except that it has per-bit granularity instead of per-byte.
- <<...>> syntax. To use the three syntaxes above, you must enclose the segments in double angle brackets. For example, Pad5 = <<0:5>>.
Don't worry if that explanation was too dense, let's go step-by-step and have a look at how to use all of it in practice.
Looping over bytes
Let's begin with simply looping over bytes.
As the syntax summary above suggests, we'll need to use X:8 syntax and Bys/binary syntax.
Download the binary sample: wget https://ctf.cdn.doma.dev/AnExtraBitForEveryByte, open a text editor and create a file called “hello_binary.erl” with the following contents:
-module(hello_binary). % this has to match your file name
-compile(export_all). % exporting everything is a bad practice for production but OK for scripts

% We use recursion for loops.
%
% You can think of it as a while loop that will go on until
% the thing we loop over is consumed entirely.
%
% The results of the loop body will be passed through in
% "accumulator" variable, conventionally called "Acc".
%
% Don't worry about polluting call stack with redundant
% function calls, most of functional programming languages
% have "tail call optimisation"!

idbyte(<<>>, Acc) -> Acc; % If 1st argument is consumed, return Acc
idbyte(<< X:8, Rest/binary >>, Acc) -> % Bind first byte to X and the rest of bytes to Rest variables
    % Continue looping over the rest of the bytes:
    idbyte(Rest,
           % Construct accumulator.
           % It begins with what we have already, but with matched byte in the end:
           << Acc/binary, X:8 >>).

main() ->
    {ok, In} = file:read_file( "AnExtraBitForEveryByte" ),
    % Erlang has a peculiar printf, all you need to remember is that
    % % "~n" is new line
    % % "~p" is a pretty-printing format directive
    % % it takes a _list_ of things to be printed
    io:format("~p~n", [In == idbyte(In, <<>>)]).
Alright, now let's compile and run it, all through Erlang's interactive shell:
λ wget https://ctf.cdn.doma.dev/AnExtraBitForEveryByte
λ erl
Erlang/OTP 22 [erts-10.6.4] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:1]
Eshell V10.6.4 (abort with ^G)
1> c(hello_binary).
hello_binary.erl:2: Warning: export_all flag enabled - all functions will be exported
{ok,hello_binary}
2> hello_binary:main().
true
ok
How wonderful! Our idbyte function is indeed an identity function for byte-aligned binaries.
Now, in the same interactive session, let's see what will happen if we give it just seven bits:
3> {ok, In} = file:read_file( "AnExtraBitForEveryByte" ).
{ok,<<25,182,97,206,233,6,70,44,216,57,205,135,42,214,25,...>>}
4> << X:7, Rest/bitstring >> = In.
<<25,182,97,206,233,6,70,44,216,57,205,135,42,214,25,152,...>>
5> hello_binary:idbyte(X, <<>>).
** exception error: no function clause matching hello_binary:idbyte(12,<<>>) (hello_binary.erl, line 16)
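As expected, it didn't work: X:7 bound X to the integer 12 (the first seven bits of 0x19), and idbyte has no clause that accepts an integer. To work with bits that don't add up to whole bytes, we need to use bitstrings, as shown in interaction number 4.
With that knowledge, we can attempt to write our function that drops the last bit of every byte.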
Add the following functions to your file:
drop_last_bit_of_every_byte(<<>>, Acc) -> Acc;
drop_last_bit_of_every_byte(<< A:7, _B:1, Rest/bitstring >>, Acc) ->
    drop_last_bit_of_every_byte(Rest, << Acc/bitstring, A:7 >>).

main1() ->
    {ok, In} = file:read_file( "AnExtraBitForEveryByte" ),
    Out = drop_last_bit_of_every_byte(In, <<>>),
    io:format("Dropped extra bits! ~p(~p)~n", [Out, bit_size(Out)]),
    file:write_file("AnExtraBitForEveryByte.out", Out).
Let's compile and run!
λ erl
Erlang/OTP 22 [erts-10.6.4] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:1]
Eshell V10.6.4 (abort with ^G)
1> c(hello_binary).
hello_binary.erl:2: Warning: export_all flag enabled - all functions will be exported
{ok,hello_binary}
2> hello_binary:main1().
Dropped extra bits! <<25,109,134,126,128,209,150,216,115,52,50,186,198,76,178,
183,20,117,201,92,114,199,137,168,52,195,24,99,197,36,24,
97,129,1:2>>(266)
{error,badarg}
That's a little bit cryptic. What it means is that file:write_file returned {error, badarg} instead of ok.
Returning {error, ErrorTag} is standard practice in Erlang (a pattern that might be familiar to Go programmers).
If we think about it for a moment, we'll realise that it's logical.
Thanks to the bit_size/1 call, we know that our bitstring is 266 bits long.
Since 266 rem 8 is 2 rather than 0, we can't write it into a byte-aligned file.
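The bit count can be double-checked with a few lines of arithmetic. This is only a sketch in Python (the language the bonus section uses later on); the 38-byte input size is read off the final offset 0x26 of the first hexdump, and the variable names are purely illustrative.

```python
# Size bookkeeping for dropping one bit out of every byte.
input_bytes = 0x26                 # 38 bytes, per the final hexdump offset
kept_bits = input_bytes * 7        # 7 of every 8 bits survive
whole_bytes, leftover_bits = divmod(kept_bits, 8)
padding_bits = (8 - leftover_bits) % 8  # zero bits needed to reach a byte boundary

print(kept_bits, whole_bytes, leftover_bits, padding_bits)  # prints: 266 33 2 6
```

So the result is 33 whole bytes plus 2 stray bits, which matches the trailing 1:2 segment in the shell output.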
Let's fix this issue by padding the resulting bitstring with zeroes.
Add the following functions to your file:
pad_with_zeros(<< X/binary >>) -> X;
pad_with_zeros(<< X/bitstring >>) ->
    Padding = 8 - (bit_size(X) rem 8),
    << X/bitstring, 0:Padding >>.

main2() ->
    {ok, In} = file:read_file( "AnExtraBitForEveryByte" ),
    Unpadded = drop_last_bit_of_every_byte(In, <<>>),
    Out = pad_with_zeros(Unpadded),
    file:write_file("AnExtraBitForEveryByte.out", Out).
Let's run it again, and then hexdump -C the result in the hope of getting the flag.
Eshell V10.6.4 (abort with ^G)
1> c(hello_binary).
hello_binary.erl:2: Warning: export_all flag enabled - all functions will be exported
{ok,hello_binary}
2> hello_binary:main2().
ok
3>
BREAK: (a)bort (c)ontinue (p)roc info (i)nfo (l)oaded
(v)ersion (k)ill (D)b-tables (d)istribution
a
Thu Dec 17 04:03:46:856264198 sweater@timetwister /tmp
λ hexdump -C ./AnExtraBitForEveryByte.out
00000000 19 6d 86 7e 80 d1 96 d8 73 34 32 ba c6 4c b2 b7 |.m.~....s42..L..|
00000010 14 75 c9 5c 72 c7 89 a8 34 c3 18 63 c5 24 18 61 |.u.\r...4..c.$.a|
00000020 81 40
Welp, we didn't get the flag, but at least we have most of the tools needed to implement binary stream editing with Erlang!
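And we can validate that our padding with zeroes worked by observing that the last byte is 0x40 = 64, which is 01000000 in binary: the two leftover bits 01 (the 1:2 at the end of the earlier output) followed by six zero bits of padding.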
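As a cross-check, the same drop-and-pad transformation can be sketched in Python on the bytes from the first hexdump. This is not the Erlang code, just an independent re-implementation that assumes the hexdump was transcribed correctly; the function names mirror the Erlang ones only for readability.

```python
# Bytes copied from the hexdump at the top of the article.
SAMPLE = bytes.fromhex(
    "19b661cee906462cd839cd872ad61998"
    "b35ac58e5c4b70e4c6c56b064d1961c6"
    "c593060c190a"
)


def drop_last_bit_of_every_byte(data: bytes) -> str:
    """Keep the top 7 bits of each byte; return them as a string of 0s and 1s."""
    return "".join(format(byte >> 1, "07b") for byte in data)


def pad_with_zeros(bits: str) -> bytes:
    """Right-pad the bit string with zeros to a whole number of bytes."""
    bits += "0" * (-len(bits) % 8)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


bits = drop_last_bit_of_every_byte(SAMPLE)
out = pad_with_zeros(bits)
print(len(bits), out[:4].hex(), out[-2:].hex())  # prints: 266 196d867e 8140
```

The first four and last two bytes agree with the hexdump of AnExtraBitForEveryByte.out, including the padded 0x40 at the end.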
Bonus content: how to get the flag after all
A simple search query tells us that the most popular 7-bit format is 7-bit ASCII.
Now all that is left is to obtain a 7-bit ASCII table and try decoding.
By far the fastest way to do it is to pipe HTML containing the table through w3m pager, getting clean, tab-separated content, and then write a VIM macro to transform it into a data structure.
Sadly, I didn't record myself doing that, but you'd better believe that it took me well under five minutes.
I chose Python as the target language; I don't really know why, it must be its reputation as the language of choice for hacky write-only stuff in the post-Perl world. Anyway, I ended up with the following code to attempt 7-bit ASCII decoding:
# ascii7, find_binary and zoom_char are defined elsewhere in the script (not shown);
# ascii7 is presumably the table built from the 7-bit ASCII chart.
def decodeBigEndian(data):
    x = int.from_bytes(data, "big")
    buff = ''
    flag = ''
    for i in "{0:b}".format(x):  # Format python int as binary
        buff = buff + i  # Build up string of 1s and 0s
        if len(buff) == 7:  # When it's 7-long, look-up character and reset the buffer
            flag = flag + zoom_char(find_binary(ascii7, buff)[0])
            buff = ''
    return flag
Indeed, when we run the script, we get a valid UUID, which must be the flag:
λ python3 7bitAscii.py
7-Bit-ASCII representation:
flag: d1309fae-0f13-11eb-adc1-0242ac120002Ctrl- J NL Line Feed
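The decoding function depends on a hand-built lookup table and helpers that aren't shown. A self-contained sketch of the same idea could look like the following. It assumes that Python's built-in chr() can stand in for the table lookup and that the sample bytes were copied correctly from the first hexdump; the name decode_7bit_ascii is made up for illustration.

```python
# Bytes copied from the hexdump at the top of the article.
SAMPLE = bytes.fromhex(
    "19b661cee906462cd839cd872ad61998"
    "b35ac58e5c4b70e4c6c56b064d1961c6"
    "c593060c190a"
)


def decode_7bit_ascii(data: bytes) -> str:
    """Decode a big-endian stream of packed 7-bit ASCII codes."""
    # Rendering the integer in binary drops leading zero bits,
    # exactly as "{0:b}".format(x) does in decodeBigEndian.
    bits = format(int.from_bytes(data, "big"), "b")
    # Only whole 7-bit groups are decoded; a shorter trailing group is ignored.
    usable = len(bits) - len(bits) % 7
    return "".join(chr(int(bits[i:i + 7], 2)) for i in range(0, usable, 7))


flag = decode_7bit_ascii(SAMPLE)
print(repr(flag))  # prints: 'flag: d1309fae-0f13-11eb-adc1-0242ac120002\n'
```

Note that this works because the binary rendering drops the three leading zero bits of 0x19, leaving 301 = 43 * 7 bits. The 7-bit groups are packed back to back and do not line up with byte boundaries, which is why dropping one bit per byte in the Erlang section did not reveal the flag.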
That's it for now. If you have any questions or feedback on the article, shoot me a toot on Mastodon.
Speaking of Mastodon! Registration is still open on our server. Make sure to grab an account both for fun and since we plan exclusive offers for active members of Doma community!
Infrastructure and development
Wed 2 Dec 21:50:57 GMT 2020
Hello, world! In this post we present our operations update for November.
The table provided is a summary of the services Pola and Jonn had to set up for operation and the time spent on setting up each of them.
Everything is Ubuntu LTS and configuration/artefact-managed.
Pola spent some time learning the intricacies of react-native-web, while Jonn spent most of the time on ops work; right now he's deploying keycloak and figuring out how to make a socketio impression with Elixir plugs.
Pola also found the BorgBackup tool, so that's on our list for backups (which we, admittedly, don't have in a proper fashion).
In terms of development, we have made considerable progress in studying recent changes to React, its capabilities, and the limitations of ultra-cross-platform development.
We have also checked out Recoil, and it seems like a really great tool that feels like bringing react/om vibes to the JS world while also being sort of actor-model-y. We like it very much and can't wait to battle-test it!
We still don't know how to deliver React-Native applications to the marketplaces, but we have a good vision of what we need to do to make a complete, useful project for the people.
Given the crowd we are interacting with, it feels like it'll be fun for everyone.
Importantly, that means that we are getting closer and closer to our goal of being able to perform and consult on rapid product development.
Conversely, we have found out that the state of affairs of AAA systems is very poor.
A decision has been made to postpone our fun little initial product development in favour of implementing a DID method that can trivially work in a self-hosted/federated setting.
We don't aim to write a keycloak-sized framework off the bat, but we aim to begin by providing a well-documented and easy way for startups and companies to let their users authenticate with a DID.
After this is done and rolled out on our servers, we will see how to expand the implementation to:
- Enable federated setting
- Integrate our solution with the user-facing services we deploy like Peertube and Mastodon
- Start the process of upstreaming our patches to said services
That's it for our November report, stay tuned for more content.
Doma: stoicism incorporated
Sat 21 Nov 09:40:36 GMT 2020
What is this doma thing all about? Here is the gist.
TL;DR
Doma is a principled product-oriented company.
Its products are privacy-enabled and open-source, deployments are subscription-based.
At the time of writing, doma is making their first full-stack product.
It will result in rapid prototyping SDK to make it easier for businesses to reinvest into themselves.
Doma is available for consultation on FP and data science.
Principles
Humans matter
- Short production cycles: we aim to have exactly one mythical person-month for full development. It is then followed by another two weeks for measuring usefulness.
- No useless work: even early adopters of our stuff will have to commit some sort of resource to using our products and tools. The default approach would be a subscription for a nominal price or unsupported self-hosting.
- Go where the users are: we won't make users come to us. Instead, we will do our best to fit their preferences and workflows.
Privacy matters
- Data storage ethics: we do not sell or disclose data or metadata, and we do not run “opt-out” targeted marketing. And we never will.
- Responsible deployments on less-private platforms: there is a world on the other side of the garage doors. Whilst deploying on popular platforms, we commit to posting noticeable messaging and promoting free and non-oppressive alternatives to said platforms.
- Warrant canaries: each of our products that stores user data on its servers shall have warrant canaries.
Business plan
- Build rapid prototyping toolchain: after we're done slow-rolling the full-stack project currently in development, we will have a reliable way to build typed, performant products targeting every platform.
- Prove its feasibility: build another product, without spending more than a person-month on it.
- Turn toolchain and processes into a software kit: my current main company has the linear business model problem, which is typical for a consultancy. To address that issue, we are actively trying (and somewhat failing) to enter the product market. It seems reasonable to me that many businesses that want to get better profit asymptotics would want to self-incubate products. Our proven kit should allow them to have bounded risks, at least as far as software development and marketing are concerned.
- Consultancy: when my partner and I aren't doing stuff on our two jobs (that's two jobs per person, not two jobs in total), we're ready to consult on the topics of
  - functional programming
    - haskell
    - erlang
    - elixir
    - rust
    - maximising utilisation of success and structural typings
  - data science
    - forecasting
    - statistical analysis
    - data visualisation
    - machine learning
Outside the garage
- Ready to scale: we plan to keep on making product prototypes until we get validation from a broader community. When that happens, we are going to put resources into the expansion.
- Conventional marketing but ethical: once we have a validated product, we will commence conventional marketing rounds, perhaps using a contractor. We will be careful to craft the messaging in such a way that the person who gets targeted by our marketing campaigns knows that we aren't happy to give our money to the platforms that are conventionally used to run marketing campaigns.
Jonn
Improvements to usability of the website
Mon 16 Nov 13:40:36 GMT 2020
A couple of days ago, we talked about setting up our grid-based microblog layout with just a few lines of CSS.
Overall, it was a successful and very rapid development, but since we have started producing content, it turned out to have some UX problems.
A particularly tricky thing was to get code listings right (but what else is new).
What we ended up doing to prevent it from messing up the width allocation of the grid is demonstrated below.
code {
  /*** (1) ***/
  white-space: pre-wrap;
  /***********/
  /*** (2) ***/
  overflow-x: auto;
  /***********/
  display: block;
  border: 1px solid #ccc;
  padding: 0.666em;
}
Important bits here are:
- (1), which makes sure that whitespace is preserved except that long lines are wrapped.
- (2), which makes sure that if we can't wrap a line, we'll denote horizontal overflow by displaying a scrollbar.
Another concern was that the articles tend to have different lengths.
To make sure that the grid is displayed nicely, without huge vertical gaps, we have measured the average display size of a microblog on our webpage and added the following rule, with its two declarations, to the global CSS file:
article {
  max-height: 640px;
  overflow-y: auto;
}
The final touch was to make sure that when the website is viewed on a smaller device, the max-height limit (along with overflow-y) is unset:
@media only screen and (max-width: 900px) {
  article {
    max-height: unset;
    overflow-y: unset;
  }
  .index__outer {
    display: grid;
    grid-template-columns: 1fr;
  }
}
The last thing on the backlog—albeit rather far down the priority list—is making anchors to the articles and creating an RSS feed.
But for now, we are content with being fairly certain that our website doesn't make readers' eyes bleed.
If you feel otherwise, please contact us at hi ат doma · dev and tell us what you think.
Doma to use patch workflow
Fri 13 Nov 07:53:49 UTC 2020
Patch workflow is practically superior to pull request workflow.
Since we have decided to use Sourcehut, we are going to benefit from it!
Set-up could be a tad daunting, so here's a checklist for those who want to adopt it too.
- Learn git diff (creates a patch) / git apply (applies a patch)
- Learn git send-email (sends a patch over E-Mail using SMTP protocol)
- Select your default GPG key (generate one with gpg --full-generate-key)
- Configure mutt
- Set up fetchmail (your login / password go to .netrc)
- Set up procmail
Set it up, play around with it a little bit, and you should be good to go. And don't hesitate to learn!
If you're a seasoned GitHub user, remember that you had to learn gitflow and other workflows imposed by GitHub.
If you're new to distributed revision control systems and collaborating with others on code, this should be approximately as hard or easy to get into as the alternatives.
Don't fear, and if you have questions — reach out to the mailing list[1] or me[2] over E-Mail, I'll do my best to answer your questions.
[1]: ~sircmpwn/sr · ht-discuss ат lists · sr · ht
[2]: jonn ат doma · dev
Set up grid layout
Fri 13 Nov 05:05:29 UTC 2020
We have migrated to a simple grid layout.
It's not 100% certain that we'll keep using it for our website, but for a pretotype, it feels like a good fit.
It addresses the following problems in a clean sweep:
- Responsive, mobile-first websites
- Good enough fit for bite-sized announcements
- Design of “product” and “team” pages should be trivial
Show me the code
Omitting media-queries, here it is:
.index__outer {
  display: grid;
  grid-template-columns:
    1fr 33% 33%;
}
.index__headline {
  grid-column-start: 1;
  grid-column-end: 3;
}
There was no additional code needed for index__item's to display correctly.
Media-queries just make the grid smaller by rewriting grid-template-columns property.
Also, when grid shrinks to 1-wide, we needed to rewrite grid-column-end of index__headline for it to not mess up the following line.
A pretty decent logo
Thu Nov 12 17:16:30 UTC 2020
Remade the company's logo, here it is:
We were going for a minimalist approach, but that's really hard to attain.
Our first version was similar in colourscheme, but a tad more cluttered and chaotic.
This logo went through several iterations, but it was still made following the pretotyping philosophy.
Although logos change, we think that our cute little design session with a Remarkable tablet and Inkscape, the FOSS Corel Draw clone, will stay in our memories for a long-long time.
Migration from WSL2
Tue Nov 11 22:11:25 UTC
We have moved servers from WSL2 to DigitalOcean in under ten minutes thanks to our DevOps project called “archive-trap”.
Stay tuned for more cool stories from doma.dev team.