
TensorFlow in Python3

TensorFlow is an open-source library that Google released earlier this month. It makes it simple to arrange processing and training flows, with elements like neural networks, and even to implement new operations on top of its architecture (tutorial and examples).

This library is written in C++, but the architecture, the data to be managed and the operations are declared in Python. This is great, as it yields high performance without having to deal with segmentation faults, but if you were expecting to use Python 3 for this... you may have to wait a while: at the moment it's not supported [tensorflow GitHub issue #1], but support is planned.

Meanwhile, the python3 branch of this repo offers a way to use it... it's not completely up to date and there are things left to tune, like checkpoints. Also, you'd have to build it manually (there's no prebuilt pip package), but we'll see that this can be done easily.

Read more…

The Julia set in a shader

Ever since smartphones took off, the idea of programming on the device itself has sounded appealing. While it's been possible (almost?) from the start with SL4A, it was never very comfortable.

Well, it turns out that on F-Droid (a market of free applications) there's an environment for programming shaders in GLSL (programs that generate graphics on the GPU), and even for using them as a wallpaper: Shader Editor.

The app is fairly simple, and it looks like a good way to learn to write shaders, knowing C for example, starting from the examples it includes.

Well then, here goes something programmed while tinkering with this on the train: it draws the Julia set, moving along a couple of dimensions so the result is somewhat dynamic.

Read more…

An algorithm for finding similar elements

It's funny: there are times when you have to find a solution to a simple problem (for example, given several lists of elements, find the one most similar to a new list) and you come up with a (very simple!) solution. But that solution doesn't show up anywhere else, even though someone must have thought of it before! Is it so simple (and so inferior to others) that it's not worth documenting? Am I simply unable to find it?... probably :P...

After running a few more tests... it turns out it doesn't scale well, and with large data sets it quickly loses its advantage xD

Well, be that as it may, here goes an algorithm to find the list (or lists) closest to a given one, without having to compare all the elements of every list.

The use case is fairly direct: in the field of AI (Artificial Intelligence) there's a family of classification algorithms which, given a labeled training set (with each element assigned to a category), find the category a new element belongs to.

Read more…

Evolving decoders [1]: Brainfuck

It's been almost a year since the last post; how time flies...

Lately I've been busy with several projects: I finished my Bachelor's thesis, which I'll try to talk about later on, and I've taken part in a few CTFs. Something I've noticed is that cryptography challenges tend to come in two kinds: those where the algorithm is clear from the start and you have to attack it, and those where you're given a ciphertext and challenged to recover the flag hidden in it.

The idea behind this second kind of challenge is (I suppose) to test the ability to recognize similarities with existing ciphers, to analyze the data (entropy, ...) and to draw conclusions from there. Some people are very good at this...

Not me.

Another option is to run experiments until something interesting turns up, but that's a long process that doesn't necessarily bear fruit. It would be interesting to automate it, wouldn't it?

Read more…

Writeup of inBINcible [NcN CTF Quals]

The blog has been quiet for a while, but I bring something interesting: let's see how to reverse one of the binaries presented at the No Con Name CTF quals.

Heads up: I'm quite a novice at this, so surely many steps could be skipped, or made simpler, with the right knowledge and tools. If you already know this stuff, you'd better skip ahead to the interesting part ;).

The binary in question is “inbincible”; if we run it, it produces the following output:

$ ./inbincible
Nope!
$

Obviously that's not the result we want, so let's see what it does: we open it with gdb and look for a function to start from.

Read more…

Extracting .mkv subtitles

Actually, this was already posted, but it got lost in some migration... so here it is again.

All the parsing and extraction is implemented by mkvtoolnix, so the first step is to install it...

sudo apt-get install mkvtoolnix

After this we can list the tracks in the file:

mkvinfo video.mkv

Read more…

Making MySQLdumps more friendly

Some time ago I had to work with some MySQL database dumps generated by mysqldump(1), kept under version control software (which fortunately hasn't been needed yet); more specifically, the one used was git. Now, git can diff across versions, but this (at least by default) is done line by line, so mysqldumps show lots of data changes even if only one row actually changed. To solve this, this program was written: sqlsplit.c.

The program isn't too polished: it has a main function that only opens the files, and another which (with the help of two macros *_*) simulates something like a state automaton (actually, one with a stack). Compiling it is simple:

gcc sqlsplit.c -o sqlsplit

So, for example, if the input were

set autocommit=0;
INSERT INTO `input_table` VALUES (1,'Title 1','File 1','Type','NULL','Date 1'),(2,'Another title','Another file, too','Type, you know','Language','9999'),(3,'A third title','File with \' too.heh','Some, types','NULL','Tomorrow');
/*!40000 ALTER TABLE `input_table` ENABLE KEYS */;
UNLOCK TABLES;
commit;

Doing this...

./sqlsplit input.sql output.sql

We'd get something more legible

set autocommit=0;
INSERT INTO `input_table` VALUES
    (1,'Title 1','File 1','Type','NULL','Date 1'),
    (2,'Another title','Another file, too','Type, you know','Language','9999'),
    (3,'A third title','File with \' too.heh','Some, types','NULL','Tomorrow');
/*!40000 ALTER TABLE `input_table` ENABLE KEYS */;
UNLOCK TABLES;
commit;

And that's all. Of course, if we wanted to take input from or show output on the terminal (for example, to consume or produce compressed files), the files to use would be /dev/stdin for the input and/or /dev/stdout for the output.

Using andEngine from emacs

I was trying out some game programming on Android; a good-looking library is AndEngine. The tutorials I found were Eclipse-centered, but after ending up with a segfault a couple of times (while importing a project!), it was time to go back to the classics, so let's see how to do it with emacs.

Read more…

Migrating ownCloud from MySQL to SQLite

At the time, I set up an ownCloud installation using MySQL as the database. Later it became obvious that this wasn't the right choice: the need to save all the RAM possible, and the fact that nobody but me accessed it, pointed to SQLite as the option I should have taken. The process is a bit tricky the first time, so here it is, written down in case it has to be repeated...

The first step is to convert the database itself to SQLite. Ideally this would mean taking a mysqldump, feeding it to SQLite, and letting the SQL standard do the rest...

But it's not that simple: it turns out there are incompatibilities between these two dialects, and resolving them by hand would require time we probably don't have. For this we can turn to sequel; with Ruby and the development libraries of the MySQL and SQLite clients installed, we can get it by doing

gem install sqlite3
gem install mysql
gem install sequel

Once installed, converting the database only takes

sequel 'mysql://db_username:db_pass@db_host/db_name' -C "sqlite://db.sqlite"

And we'll have the database migrated to SQLite in db.sqlite; only the configuration remains.

In the ownCloud installation directory there's a directory called config; inside it, the file to edit to switch to the new database is config.php. The meaning of each line can be seen in config.sample.php; concretely, the "dbtype" line has to be changed from mysql to sqlite.

Finally, place the new database in the data folder, with the name the database had plus the .db extension, and that's it: after setting the permissions so that ownCloud can access and modify the file, we can use the platform again.

Writing an Erlang port

This quarter we had a subject with an assignment to be developed in Erlang, a functional language oriented to concurrent programming, with interprocess communication through message passing. The result is a crawler where each domain has an assigned “thread” that makes the requests to the web server, plus another one to download the images and index them using pHash; the program is composed of more parts, but here we'll focus on this.

(By the way, the project has been developed in the open; the code is available at its GitHub repository, EPC.)

At the beginning each thread simply made a call to httpc:request, which is what the standard library offers to make these requests, but it seems that the concurrency is not handled very well; this produced starvation in the indexing process.

Further down in the specification a possible solution is shown:

Option (option()) details:
sync
     Shall the request be synchronous or asynchronous.
     Defaults to true.

Anyway, at that moment that solution wasn't tried; instead, another two were implemented. One was a dedicated process making sure that the indexer's downloads had priority, so it wouldn't suffer starvation; this wouldn't happen to the crawlers, because the indexer takes some time to obtain the features of each downloaded image (not much, but some). This was implemented in pure Erlang, and it was the one merged into the master branch.

The other option was to implement the download as a port: an external program, written in C, that is called from an Erlang process. This possibility was kept in the GET-by-port branch.

C - Erlang Communication

The port is composed of two components, the C part and the Erlang part. The communication can be done in multiple ways, and is defined along with PortSettings in open_port. The possibilities are:

  • {packet, N}

    Messages are preceded by their length, sent in N bytes, with the most significant byte first. Valid values for N are 1, 2, or 4.

  • stream

    Output messages are sent without packet lengths. A user-defined protocol must be used between the Erlang process and the external object.

  • {line, L}

    Messages are delivered on a per line basis. Each line (delimited by the OS-dependent newline sequence) is delivered in one single message. The message data format is {Flag, Line}, where Flag is either eol or noeol and Line is the actual data delivered (without the newline sequence).

    L specifies the maximum line length in bytes. Lines longer than this will be delivered in more than one message, with the Flag set to noeol for all but the last message. If end of file is encountered anywhere else than immediately following a newline sequence, the last line will also be delivered with the Flag set to noeol. In all other cases, lines are delivered with Flag set to eol.

    The {packet, N} and {line, L} settings are mutually exclusive.

In this case we'll use {packet, 4}, enough to send whole web pages.

Communication - The C side

So let's focus on what happens in the C program when it receives the data; the function that manages this is char* read_url().

The process is simple: read 4 bytes from stdin and save them as a uint32_t.

uint32_t length;
if (fread(&length, 4, 1, stdin) != 1){
    return NULL;
}

Then it converts the data from big-endian to the host endianness; this is done with the function ntohl():

length = ntohl(length);

The rest is simply reading a string from stdin, without any transformation, knowing its length:

char *url = malloc(sizeof(char) * (length + 1));
if (url == NULL){
    return NULL;
}

unsigned int already_read = 0;
while (already_read < length){
    already_read += fread(&url[already_read], sizeof(uint8_t),
                          length - already_read, stdin);
}
url[length] = '\0';

Returning the data to Erlang doesn't take much more effort, as can be seen in the void show_result(headers, body) procedure. The data is sent in two groups: first the headers, then the response body. Sending the headers means converting their size to big-endian using htonl() and writing it to stdout, then writing the whole string directly:

/* Strange things can happen if we forget this */
uint32_t headers_size = htonl(headers.size);
fwrite(&headers_size, 4, 1, stdout);

unsigned int written_head = 0;
while (written_head < headers.size){
    written_head += fwrite(&(headers.memory[written_head]), sizeof(uint8_t),
                           headers.size - written_head, stdout);
}

... and repeat this for the response body

uint32_t body_size = htonl(body.size);
fwrite(&body_size, 4, 1, stdout);

unsigned int written_body = 0;
while (written_body < body.size){
    written_body += fwrite(&(body.memory[written_body]), sizeof(uint8_t),
                           body.size - written_body, stdout);
}

This is all it takes to complete the interface with Erlang. The rest is common C logic, which in this case means making HTTP requests, something easy using cURL; each time we want to receive a URL from Erlang we only need to call read_url(), and of course the compilation is done the usual way.

Communication - The Erlang side

The logic on the Erlang side isn't complex either: it comes down to calling open_port, which returns the port identifier to send the data to. Assuming we have defined HTTP_GET_BINARY_PATH as the path of the compiled binary that completes the port:

Port = open_port({spawn, ?HTTP_GET_BINARY_PATH}, [{packet, 4}])

When data needs to be sent to the binary, it goes through this port identifier, as a tuple like

{Actual_process_PID, {command, Message}}

For example

Port ! {self(), {command, Msg}},

This is converted into the data the binary receives; in the same way, when it sends data back, it will be received as a message like

{Port_PID, {data, Received_message}}

Since we're receiving two messages, the headers and the message body...

receive
    {Port, {data, Headers}} ->
        receive
            {Port, {data, Body}} ->
                From ! {self(), {http_get, {ok, {200,
                                                 process_headers(Headers),
                                                 Body}}}}
        end
end,

The result is sent as a message because this is meant to run inside a loop, keeping the port active in an isolated process until the process that created it terminates. But it doesn't need to be used this way: there's no reason it couldn't return the result directly, as a “normal” function, though (maybe?) it could have problems coordinating access to the input/output if multiple processes use it.

... and that's all: we have a piece of C code running from Erlang. Of course, since the interface is stdin/stdout, any language can be used; it's a very flexible design :)