# Merlin: deep learning framework for Julia

`Merlin` is a deep learning framework written in Julia. It aims to provide a fast, flexible, and compact deep learning library for machine learning.

`Merlin` is tested against Julia `0.6` on Linux, OS X, and Windows (x64).

## Documentation

## Requirements

- Julia 0.6
- g++ (for OSX or Linux)

## Installation

`julia> Pkg.add("Merlin")`
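Then load the package in the REPL to confirm the installation (this assumes the standard Julia `0.6` package workflow):

```
julia> using Merlin
```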

## Quick Start

Basically:

- Wrap your data with `Var` (the variable type).
- Apply functions to the `Var`. A `Var` memorizes the history of function calls for auto-differentiation.
- Compute gradients if necessary.
- Update the parameters with an optimizer.

Here is an example of a three-layer network. `Merlin` supports both static and dynamic evaluation of neural networks.

### Dynamic Evaluation

```
using Merlin
T = Float32
x = zerograd(rand(T,10,5)) # instantiate a Var with zero gradients
y = Linear(T,10,7)(x)
y = relu(y)
y = Linear(T,7,3)(y)
params = gradient!(y)
println(x.grad)
opt = SGD(0.01)
foreach(opt, params)
```

If you don't need gradients of `x`, use `x = Var(rand(T,10,5))`, where `x.grad` is set to `nothing`.
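As a minimal sketch of this (reusing the layer calls from the dynamic-evaluation example above):

```
using Merlin
T = Float32
x = Var(rand(T,10,5))        # plain Var: x.grad is nothing
y = relu(Linear(T,10,7)(x))
params = gradient!(y)        # layer parameters still receive gradients
@assert x.grad === nothing   # no gradient is accumulated for x
```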

### Static Evaluation

For static evaluation, the process is as follows:

- Construct a `Graph`.
- Feed your data to the graph.

When a function is applied to a `Node`, the evaluation is performed lazily.

```
using Merlin
T = Float32
n = Node(name="x")
n = Linear(T,10,7)(n)
n = relu(n)
n = Linear(T,7,3)(n)
@assert typeof(n) == Node
g = Graph(n)
x = zerograd(rand(T,10,10))
y = g("x"=>x)
params = gradient!(y)
println(x.grad)
opt = SGD(0.01)
foreach(opt, params)
```

When the network structure is static, this style is recommended.

## Examples

### MNIST

- See the MNIST example.

### LSTM

This is an example of a batched LSTM.

```
using Merlin
T = Float32
a = rand(T,20,3)
b = rand(T,20,2)
c = rand(T,20,5)
x = Var(cat(2,a,b,c))
lstm = LSTM(T, 20, 20) # input size: 20, output size: 20
y = lstm(x, [3,2,5]) # batch sizes of a, b, and c
```

More examples can be found in `examples`.