Press "Enter" to skip to content

Tag: julia

Julia: Piled-Like Matrix Multiplication on Memory-Limited GPUs Using MXNet

2D Matrix Multiplication on GPUs with Limited Memory

There was a time when I imagined feeding my big, big data to the graphics card smoothly, but Mom told me that is impossible, because my data stream is much too big and the memory of the graphics card is limited. Memory is to a graphics card what capacity is to a human brain.

But we have hard drives and CPU memory, which are a lot larger than GPU memory. Using CUDA, one can slice the matrices into parts, compute the sub-matrix products on the GPU one at a time, and compose the small results into the final matrix. This eases the burden when the matrices are too big to fit into GPU memory at once, and it is just what this article introduces.
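To make the idea concrete, here is a minimal CPU-only sketch of the piling scheme in plain Julia (not the MXNet/GPU code from the post; the function name piled_mul and the block size s are my own):

# a CPU sketch of "piled" (blocked) matrix multiplication;
# on a GPU, each small block product below would be computed on the device
function piled_mul(A, B, s)
    m, k = size(A)
    k2, n = size(B)
    @assert k == k2
    C = zeros(eltype(A), m, n)
    for i = 1:s:m, j = 1:s:n
        ii = i:min(i+s-1, m)
        jj = j:min(j+s-1, n)
        for p = 1:s:k
            pp = p:min(p+s-1, k)
            # only s-by-s pieces need to fit in device memory at a time
            C[ii,jj] += A[ii,pp] * B[pp,jj]
        end
    end
    C
end

A = rand(100, 80); B = rand(80, 120)
isapprox(piled_mul(A, B, 32), A * B) # true

Each step touches only three s-by-s blocks, which is why the scheme fits in a limited GPU memory.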

The so-called "piling" operation in 2D matrix multiplication.
The so-called “piling” operation in 2D matrix multiplication.

Continue reading Julia: Piled-Like Matrix Multiplication on Memory-Limited GPUs Using MXNet

Julia: Solving A Package Related LoadError

I was determined to use Julia on some bigger and more interesting coding missions several days ago. But the official package manager is not always satisfying, and there are some little obstacles when adding a package. Wherever the problems come from, some confusion must be resolved before the next step.

LoadError When Using a Package

Yes, it’s NearestNeighbors.jl. Normally you just need to type

julia> Pkg.add("NearestNeighbors")

And the package will automatically be added, with all its “dependencies” resolved.

And so it did. But when you actually try to use it, a problem occurs.

julia> using NearestNeighbors
WARNING: could not import Distances.eval_reduce into NearestNeighbors
WARNING: could not import Distances.eval_end into NearestNeighbors
WARNING: could not import Distances.eval_op into NearestNeighbors
WARNING: could not import Distances.eval_start into NearestNeighbors
ERROR: LoadError: LoadError: UndefVarError: UnionMetrics not defined
 in include at ./boot.jl:261
 in include_from_node1 at ./loading.jl:304
 in include at ./boot.jl:261
 in include_from_node1 at ./loading.jl:304
 in require at ./loading.jl:243
while loading /home/joseph/.julia/v0.4/NearestNeighbors/src/evaluation.jl, in expression starting on line 11
while loading /home/joseph/.julia/v0.4/NearestNeighbors/src/NearestNeighbors.jl, in expression starting on line 33

Well, it seems there’s something wrong with the package. Its site says it needs the latest Distances.jl package and suggests that we check out the latest version of Distances.jl; the one installed is not new enough!

julia> Pkg.checkout("Distances")

And nothing changed. I tried again and it still failed with the same error. After some analysis and experimenting, I found that the problem lies in the version of the Distances.jl package.
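By the way, if you want to experiment with package versions yourself, the stock Pkg API can inspect and pin them; the version number below is only a placeholder for illustration, not necessarily the one that fixes this error:

julia> Pkg.installed("Distances")     # show the currently installed version
julia> Pkg.pin("Distances", v"0.3.0") # pin to a chosen version (placeholder)
julia> Pkg.free("Distances")          # undo the pin afterwards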

How to Make It “Right”

Continue reading Julia: Solving A Package Related LoadError

Bricks on Julia

Writing down some things I found when coding, like some kinds of syntactic sugar… though with no guarantee of correctness…

Initializing An Array

There are different kinds of array initialization. Notice the difference between the “{}” and “()” notations.

# Array Initialization
a = Array{Int64,1}() # 0-element Array{Int64,1}
b = Array(Int64,0) # 0-element Array{Int64,1}
c = Array(Int64,1)
# 1-element Array{Int64,1}:
# 6874584755139077376
d = Array(Int64,2)
# 2-element Array{Int64,1}:
# 140298281045760
# 140298275170800
e = Array(Int64,2,2)
# 2x2 Array{Int64,2}:
# 140298302013152 140298289255248
# 140298289255184 140298302013232
f = Array{Int64,2}(2,2)
# 2x2 Array{Int64,2}:
# 140298246197808 140298268973680
# 140298268973616 140298268947024

All of them work. The numbers in the parentheses specify the dimensions; note that arrays created this way are uninitialized, so their entries are arbitrary garbage values, as shown above.

How Does a Sparse Matrix Work?

The official documentation doesn’t tell much, so I went to the source code here. There’s a type definition, SparseMatrixCSC, which is the basic type behind a lot of the sparse-matrix methods. The definition is as follows:

# Type Definition
type SparseMatrixCSC{Tv,Ti<:Integer} <: AbstractSparseMatrix{Tv,Ti}
    m::Int              # Number of rows
    n::Int              # Number of columns
    colptr::Vector{Ti}  # Column i is in colptr[i]:(colptr[i+1]-1)
    rowval::Vector{Ti}  # Row values of nonzeros
    nzval::Vector{Tv}   # Nonzero values
end

After tossing the code about for a while, I figured out what all the parameters in SparseMatrixCSC mean.

  • m: the number of rows in the matrix
  • n: the number of columns in the matrix
  • colptr: column pointers; they record the starting and ending indexes of every column, usually an Array of shape (n+1,)
  • rowval: the row index of every stored (non-sparse) element; an Array whose size equals the number of stored elements
  • nzval: the value of every stored element; an Array of the same size as rowval

In fact, in a sparse matrix like the following:

# a sparse matrix
# 2 x 3 5 x x x 7
# x x x 1 4 x x x
# 7 8 4 5 3 x x x

in which an “x” indicates that the element is not stored. The parameters are:

# parameters of the matrix
# m = 3
# n = 8
# colptr = [1, 3, 4, 6, 9, 11, 11, 11, 12]
# rowval = [1, 3, 3, 1, 3, 1, 2, 3, 2, 3, 1]
# nzval = [2, 7, 8, 3, 4, 5, 1, 5, 4, 3, 7]

The colptr may seem a little hard to understand. If you iterate over the stored values with a single iterator, you get the actual storage order of the matrix, [2, 7, 8, 3, 4, 5, 1, 5, 4, 3, 7], and the column pointers record, for each column, the index at which that column’s first value occurs in this sequence, or the index of the first expected value if the column holds none. For example, the first value of the 1st column is 2, which has index 1 in the sequence; the first value of the 2nd column is 8, which has index 3; and so on. That gives the list of column pointers.

Notice that there is no stored element in the 6th column; the next value we expect to occur is 7, so we fill in 11 (the index of 7 in the storage sequence) as its column pointer, and the same goes for the equally empty 7th column.

The rowval is easier to understand: you just assign a row index to each stored element, and the nzval, which holds the values themselves, works the same way.
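As a quick check, here is a small sketch that walks the three arrays and prints every stored entry of the example matrix (the variables mirror the values listed above):

# stored entries of column j occupy positions colptr[j]:(colptr[j+1]-1)
n = 8
colptr = [1, 3, 4, 6, 9, 11, 11, 11, 12]
rowval = [1, 3, 3, 1, 3, 1, 2, 3, 2, 3, 1]
nzval  = [2, 7, 8, 3, 4, 5, 1, 5, 4, 3, 7]
for j = 1:n
    for k = colptr[j]:(colptr[j+1]-1)
        println("A[", rowval[k], ",", j, "] = ", nzval[k]) # e.g. A[1,1] = 2
    end
end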

And now we may construct whatever specific sparse matrix we would like.
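For instance, the example matrix can be built directly from its five fields and checked against the dense form (a sketch, assuming the outer constructor accepts (m, n, colptr, rowval, nzval) as in the source linked above; full is the Base name in this Julia version):

# build the 3x8 example matrix from its CSC fields
S = SparseMatrixCSC(3, 8,
        [1, 3, 4, 6, 9, 11, 11, 11, 12],   # colptr
        [1, 3, 3, 1, 3, 1, 2, 3, 2, 3, 1], # rowval
        [2, 7, 8, 3, 4, 5, 1, 5, 4, 3, 7]) # nzval
full(S) # the dense matrix above, with every "x" stored as 0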

Dispatching a Mission List in One Line

When doing parallel computing, especially when using DistributedArrays, one often wants to dispatch/separate an array into (even) parts. It seems I couldn’t find a simple method to achieve that. Let’s say you want M missions dispatched/separated/grouped/split across N processes/workers; the following may help:

# dispatch your mission
M = 50
N = 8
dlist = [collect(i:N:M) for i=1:N]
# [1,9,17,25,33,41,49] 
# [2,10,18,26,34,42,50]
# [3,11,19,27,35,43]   
# [4,12,20,28,36,44]   
# [5,13,21,29,37,45]   
# [6,14,22,30,38,46]   
# [7,15,23,31,39,47]   
# [8,16,24,32,40,48]

And you don’t need to worry about whether M divides evenly. But this grouping seems quite “sparse”; what if I want neighboring missions dispatched to the same process?

# dispatch your mission
M = 50
N = 8
[p[1]+1:p[2] for p in[unshift!([sum([floor(Int64,M/N)+(i<=M%N) for i=1:N][1:j]) for j=1:N],0)[k:k+1] for k=1:N]]
# 8-element Array{Any,1}:
# 1:7  
# 8:14 
# 15:20
# 21:26
# 27:32
# 33:38
# 39:44
# 45:50
[collect(p[1]+1:p[2]) for p in[unshift!([sum([floor(Int64,M/N)+(i<=M%N) for i=1:N][1:j]) for j=1:N],0)[k:k+1] for k=1:N]]
# 8-element Array{Any,1}:
# [1,2,3,4,5,6,7]     
# [8,9,10,11,12,13,14]
# [15,16,17,18,19,20] 
# [21,22,23,24,25,26] 
# [27,28,29,30,31,32] 
# [33,34,35,36,37,38] 
# [39,40,41,42,43,44] 
# [45,46,47,48,49,50]

Just what I wanted, but can the code itself be more elegant? Well, perhaps you may find this easier to understand:

# dispatch your mission
M = 50
N = 8
a = [floor(Int64,M/N)+(i<=M%N) for i=1:N] # how many elements in each group
# [7, 7, 6, 6, 6, 6, 6, 6]
b = unshift!([sum(a[1:j]) for j=1:N],0) # cumulative boundaries: group k spans b[k]+1 : b[k+1]
# [0, 7, 14, 20, 26, 32, 38, 44, 50]
c = [collect(p[1]+1:p[2]) for p in[b[k:k+1] for k=1:N]] # collect the range between consecutive boundaries
# c is the demanded grouping scheme

Well, I am still thinking of some better and more elegant method, which may take some time. Before that, you can wrap the code into a function for convenience, as sketched below.
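A possible wrapper might look like this (the name dispatch_missions is mine, and div(M, N) simply replaces floor(Int64, M/N), which is equivalent for positive integers):

# group missions 1..M into N consecutive, nearly even chunks
function dispatch_missions(M::Int, N::Int)
    a = [div(M, N) + (i <= M % N) for i = 1:N]  # size of each group
    b = unshift!([sum(a[1:j]) for j = 1:N], 0)  # cumulative boundaries
    [collect(b[k]+1:b[k+1]) for k = 1:N]
end

dispatch_missions(50, 8) # [1,...,7], [8,...,14], ..., [45,...,50]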

And I translated it to Python, which reads:

# dispatch your mission in Python
M = 50
N = 8
L = [list(range(p[0]+1,p[1]+1)) for p in [([0]+[sum([int(M/N)+(i<=M%N) for i in range(1,N+1)][0:j]) for j in range(1,N+1)])[k:k+2] for k in range(0,N)]]

It looks… well, perhaps there are better ways…

Avoid Some Structures When Saving with JLD

You know, some structures should be avoided, like Dict{UTF8String,Array{Float64,1}}: even if you explicitly declare the types of every component, they are still slow to save…
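One workaround I would try (a hedged sketch, not timed here) is to store such a dictionary as two concretely typed arrays and rebuild it after loading:

# save a Dict{UTF8String,Array{Float64,1}} as two concretely typed arrays
using JLD
d = Dict{UTF8String,Array{Float64,1}}("a" => rand(3), "b" => rand(3))
ks = UTF8String[k for k in keys(d)]      # keys, concretely typed
vs = Array{Float64,1}[d[k] for k in ks]  # values, in the same order
save("./test_dict.jld", "keys", ks, "values", vs)
# rebuild after loading:
# Dict(zip(load("./test_dict.jld", "keys"), load("./test_dict.jld", "values")))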
Continue reading Bricks on Julia

Collections on Julia

I will update this post from time to time…

Deep Learnings on Julia

DEEP LEARNING AND GPU PARALLELIZATION IN JULIA (pdf)

MXNet.jl

MXNet Documentation: http://mxnetjl.readthedocs.org/en/latest/

Mocha.jl

Knet.jl

THE NEW PHALLS

UNDERSTANDING OBJECT-ORIENTED PROGRAMMING IN JULIA – OBJECTS (PART 1)

UNDERSTANDING OBJECT-ORIENTED PROGRAMMING IN JULIA – INHERITANCE (PART 2)

Blends

The Julia Express: http://bogumilkaminski.pl/files/julia_express.pdf

Map, Filter and Reduce in Julia: https://mmstickman.wordpress.com/2015/01/30/map-filter-and-reduce-in-julia/

What is a “symbol” in Julia? http://stackoverflow.com/questions/23480722/what-is-a-symbol-in-julia

[CN] An Interesting Introduction: http://home.ustc.edu.cn/~rogerluo/sfdslides.html#/

[CN] Talking About Parallel Computing & Performance Tips: http://cos.name/wp-content/uploads/2013/05/changyou-Julia-20130518.pdf

Interesting Automatic Differentiation/Gradient Packages: http://www.juliadiff.org/

Neural networks and a dive into Julia: http://blog.yhathq.com/posts/julia-neural-networks.html

Tricked out iterators in Julia: http://slendermeans.org/julia-iterators.html

Julia for Machine Learning: http://www.cs.toronto.edu/~jsnell/assets/julia-tutorial.pdf

Running Shell Commands from Julia: http://blog.leahhanson.us/running-shell-commands-from-julia.html

A Weekend With Julia: An R User’s Reflections: http://www.econometricsbysimulation.com/2014/04/a-weekend-with-julia-r-users-reflections.html

A Lisper’s first impression of Julia: http://p-cos.blogspot.sg/2014/07/a-lispers-first-impression-of-julia.html

Fun with Julia, metaprogramming and Sublime Text: http://reganmian.net/blog/2013/09/29/fun-with-julia-metaprogramming-and-sublime-text/

Fun With Just-In-Time Compiling: Julia, Python, R and pqR: http://randyzwitch.com/python-pypy-julia-r-pqr-jit-just-in-time-compiler/

Related Blogs

Cultivating a Simple Life.: http://kdr2.com/

Chiyuan Zhang: http://pluskid.org/

A Julia Language Blog Aggregator: http://www.juliabloggers.com/

Continue reading Collections on Julia

Another Day on Julia

Feelings During Writing

There was a time when I thought this language was about to get out of my control. How could I write such code, and my dear, how could it execute the code in such a way!

Calculating 9,000,000,000+ (9 billion+) Cosine Distances on 20 Processes Simultaneously

JLD Performance on Different String Types

It seems Julia is slow in dealing with certain kinds of strings, and so is JLD serialization. I compared the following:

# a little test
using JLD
a = rand(10000000) # 10000000-element Array{Float64,1}
b = [string(i) for i in a] # 10000000-element Array{Any,1}
@time save("./test_float.jld","_",a) # 1.100200 seconds (1.62 M allocations: 72.982 MB)
@time save("./test_string.jld","_",b) # terminated when running 5+ minutes

Suspecting that the element type of the array affected the efficiency, I continued with the following:

# a little test
using JLD
a = rand(10000000) # 10000000-element Array{Float64,1}
b = [string(i) for i in a] # 10000000-element Array{Any,1}
c = Array{AbstractString,1}(b) # 10000000-element Array{AbstractString,1}
d = Array{ASCIIString,1}(b) # 10000000-element Array{ASCIIString,1}
e = Array{UTF8String,1}(b) # 10000000-element Array{UTF8String,1}
@time save("./test_abstr.jld","_",c) # terminated when running 5+ minutes
@time save("./test_ascstr.jld","_",d) # 2.094302 seconds (10.06 M allocations: 231.583 MB)
@time save("./test_utfstr.jld","_",e) # 3.230833 seconds (10.05 M allocations: 231.363 MB, 44.33% gc time)

That’s quite a relief! Using a definite (concrete) element type in the array definition will probably accelerate the serialization. In some cases it seemed that JLD had flopped into an infinite loop (I don’t know whether there’s a bug).
Continue reading Another Day on Julia

One Day on Julia

Meeting Julia at Coffee Break

It was a sunny day when I gradually got tired of a certain programming language, which is slow to some degree. Above all, I had made no actual progress on the efficiency of code execution during the past few years, for which, of course, my laziness should account. Finally, during my boss’s coffee break, I had a glimpse of Julia.

And I think this will help me out after some quiet reading.

Here’s what I read, and what impressed me most is definitely the performance of Julia compared to other languages:

  • Julia’s Role in Data Science
  • Official Website of Julia
    • You may find a lot of useful introductions and tutorials here. There’s a collection of them on the website which helped me a lot.
    • Currently some of the documents on the website need updates; for newer discussions you may want to go to their GitHub site.
  • Source of Julia (GitHub) (Just for more reference, of course I was not reading the source code…)
    • Here’s where the most recent issues are discussed; better visited once you dig into the language.
Julia Performance Compared to Other Languages (From the Official Website)

I saw comments on defects in the experiment settings, but my passion urged me to continue finding out the truth by myself, and here we go.

Continue reading One Day on Julia