What should Julia users know about?

2022-11-28 — (12 min read)

Julia is a programming language designed to get high performance with code that’s easy to read and write. It’s really good! You should try it. If you already have, here’s my list of broadly useful things to be aware of that you might miss by just diving in.

I don’t mean for this to be overwhelming. It’s not homework or anything. A good way to get into Julia is to start using it for a problem you’re interested in, picking up as much as you need to get things done. But at some point you’ll want to know what you don’t know.

I’ve tried to organize things starting with the most important and basic information, both along the whole page and within each section. These are personal suggestions.⁺ ⁺ I’m an intermediate Julia user, having made some small contributions to the ecosystem and having done a few years of development for private scientific projects. ⏎ If you have ideas about what should go here, let me know on Twitter.

Community

The Discourse board is the best place for questions and general discussion.⁺ ⁺ In particular, it’s more active than StackExchange. ⏎ There’s also a Slack, the Forem (a sort of community blog), and a whole bunch of people worth following on Twitter.

Contributing to the Julia ecosystem is a great way to get a handle on the tacit or idiomatic stuff you don’t get out of documentation or blog posts.

A good way to find packages in a given domain is to browse the GitHub Organizations that the community groups related packages into. There are thousands of registered packages in the ecosystem, and if you search or ask around on Discourse and Slack you might find what you’re looking for—or at least inspiration.

The language

I recommend reading through the manual to at least Documentation as a start. From the later sections I’ll single out the Performance Tips and Style Guide. It’s not necessarily obvious, but those last two are very useful for understanding the language.⁺ ⁺ As for style, the Blue guidelines have more detailed recommendations. ⏎

In the Julia REPL, you can type ? to enter help mode, where Julia will attempt to show documentation for anything you enter (and fuzzy match suggestions if it doesn’t find anything). Function documentation will often mention related functions, which is useful for discovery. The standard library has InteractiveUtils.methodswith, which lists the methods that accept an argument of a given type. It helps if you miss discovering class methods through tab completion, the way you would in an object-oriented language.
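As a rough sketch of that kind of discovery outside help mode, the snippet below asks which functions exported from Base have a method that takes a Regex. The choice of Regex and the variable names are mine, purely for illustration, and the exact list you get depends on your Julia version.

```julia
using InteractiveUtils  # loaded automatically in the REPL, needed explicitly in scripts

# All methods reachable from Base's exported names that have a Regex argument.
regex_methods = methodswith(Regex, Base)

# Collapse the methods down to the names of the functions they belong to.
regex_function_names = unique(m.name for m in regex_methods)

println(length(regex_methods), " methods across ",
        length(regex_function_names), " functions")
println(sort(regex_function_names)[1:min(end, 10)])
```

Calling methodswith with only a type searches every loaded module, which is closer to what tab completion on an object would give you but can produce a long list.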

I find it pays off to skim documentation even for basics like numbers, strings, and arrays, at least to get an idea of the kind of affordances Julia provides. You’ll be more likely to look something up when you can tell it should exist.

For example, I’ll call out the collections and data structures and iteration utilities, some of which I wouldn’t have guessed were built into the language but which are very useful.
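To give a flavor of what is there, here is a small sketch using only Base. The data and variable names are made up for illustration.

```julia
labels = ["a", "b", "c", "d", "e"]
readings = [3.1, 2.7, 4.4, 5.0, 1.9]

# zip walks several collections in lockstep; enumerate adds a counter.
labeled = collect(zip(labels, readings))
for (i, (label, value)) in enumerate(labeled)
    println(i, ": ", label, " => ", value)
end

# Iterators.partition yields chunks of at most n elements.
chunks = [collect(chunk) for chunk in Iterators.partition(1:7, 3)]

# Iterators.flatten concatenates nested iterables lazily.
flat = collect(Iterators.flatten([[1, 2], [3], [4, 5]]))

# accumulate keeps the running result of a reduction.
running_total = accumulate(+, [1, 2, 3, 4])

# get with a default value makes counting into a Dict straightforward.
letter_counts = Dict{Char,Int}()
for letter in "banana"
    letter_counts[letter] = get(letter_counts, letter, 0) + 1
end

# Sets support the usual operations such as intersect and union.
shared = intersect(Set([1, 2, 3]), Set([2, 3, 4]))
```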

Similarly, base Julia has mathematical functions it’s worth knowing exist: mod2pi and rem2pi, along with sincos, cis, sinpi, sind, and so on, which save some effort and improve numerical accuracy in trigonometry; expm1 and log1p, which do the same for arguments near zero; hypot and LinearAlgebra.norm for vector norm calculations that avoid overflow and underflow; and various other utilities like clamp and mod1.
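A few of these are easiest to appreciate next to the naive formula they replace. The values below are arbitrary picks of mine, chosen to make the floating-point differences visible.

```julia
using LinearAlgebra

tiny = 1e-10
naive_expm1 = exp(tiny) - 1   # cancellation: only about half the digits are right
good_expm1 = expm1(tiny)      # accurate to full precision
good_log1p = log1p(tiny)      # same idea for log(1 + x)

huge = 1e200
naive_norm = sqrt(huge^2 + huge^2)  # huge^2 overflows, so this is Inf
good_hypot = hypot(huge, huge)      # finite
good_norm = norm([huge, huge])      # finite as well

sin_at_pi = sin(pi * 1.0)     # small but nonzero, because pi is rounded
sinpi_at_one = sinpi(1.0)     # exactly zero
sind_at_180 = sind(180)       # degrees version, also exactly zero

s, c = sincos(0.3)            # both values in one call
unit_phasor = cis(0.3)        # cos(0.3) + im * sin(0.3)

wrapped_positive = mod2pi(-1.0)              # in [0, 2pi)
wrapped_centered = rem2pi(4.0, RoundNearest) # in [-pi, pi]

month = mod1(13, 12)          # 1-based wraparound gives 1 where mod would give 1, and mod1(12, 12) gives 12 where mod gives 0
limited = clamp(15, 0, 10)    # 10
```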

The manual’s noteworthy differences from other languages mentions a wide assortment of features and conventions even if you’re not coming from MATLAB, R, Python, C/C++, or Common Lisp.

Stefan Karpinski’s 2019 talk The Unreasonable Effectiveness of Multiple Dispatch is great if you want to understand why some people like Julia so much, along with some of the choices Julia makes as a language.^a ^a I don’t know if it belongs in the canon, but my favorite JuliaCon talk may be Taking Vector Transposes Seriously from Jiahao Chen in 2017. It’s about Julia pre-1.0, so be careful—for example, these days the single quote gives you the adjoint, not the transpose. ⏎
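The adjoint versus transpose distinction from that footnote only shows up with complex entries, so it is easy to miss. A quick check, with matrices I made up for the purpose:

```julia
using LinearAlgebra

A = [1+2im 3; 4 5-1im]

# The single quote is the adjoint: transpose plus complex conjugation.
quote_is_adjoint = A' == adjoint(A)
adjoint_is_conj_transpose = A' == conj(transpose(A))
differs_from_transpose = A' != transpose(A)

# For real matrices the two coincide, which is why the difference can go unnoticed.
R = [1.0 2.0; 3.0 4.0]
same_for_real = R' == transpose(R)

# A vector adjoint times a vector is a scalar, not a 1-element array.
v = [1, 2, 3]
squared_length = v' * v
```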

Workflow

For workflow basics, start with the Julia REPL, Workflow Tips, and the package manager documentation up through working with environments. If you’re developing a package, Revise.jl is basically mandatory to avoid constantly restarting Julia.⁺ ⁺ Even if you’re just loading code from a file using include, Revise gives you the includet method for “include and track”, so if you edit the file the changes are automatically loaded in your Julia session. I think some people miss this feature because it’s not the headline use case for Revise, but it’s nice. ⏎

VS Code with the Julia extension is the recommended editor. Some of what you’ll find in the user guide for Julia in VS Code you’d expect from any IDE—formatting, linting, running code—and some of it is Julia-specific, like using sysimages to start sessions faster (via PackageCompiler).

Pluto gives you a notebook like Jupyter, but without hidden state from out-of-order execution and with a source file that’s valid Julia
UnicodePlots is a surprisingly richly featured backend for Julia’s main plotting interface Plots.jl that works in a terminal
PrettyTables prints tabular data in a terminal in a very readable way
OhMyREPL adds some features to the Julia REPL like customizable syntax highlighting and rainbow parenthesis matching
PackageCompiler, mentioned above, can be used with or without the VS Code extension
PkgTemplates is the canonical tool for creating new packages—which isn’t hard to do by hand, but PkgTemplates is friendlier and can be customized to set up a repository configured for CI, code coverage, documentation, and so on
JuliaFormatter is a customizable code formatter that also integrates well with the VS Code extension

The Parameters pattern

I work with modeling problems involving many instances of components each with many parameters. It’s hard to manage this situation elegantly together with requirements like default parameters, keyword constructors, efficient access to and mutation of parameters, lightweight representations that can be used directly in optimizers and differential equations, and so on. It’s enough of a frustration in other languages that it gets its own heading here.

One option is to use NamedTuples to hold parameters. You might like NamedTupleTools, which makes some things more straightforward. Also, don’t miss that property destructuring syntax was added in 1.7.
The next step up is probably to define structs using Base.@kwdef to automatically get keyword constructors with default fields (Parameters.jl also does this, but it’s not necessary now that @kwdef is all grown up⁺ ⁺ it’s been around for a while, but for obscure reasons @kwdef will only be exported and part of the public API as of Julia 1.9 ⏎ )
Accessors (successor to Setfield) gives you a simple way to update immutable data structures
StructArrays defines an array type that acts like an array of struct elements but is stored internally as a list of arrays (potentially allowing, for example, effective use of SIMD without sacrificing ergonomics; see AoS and SoA)
ComponentArrays is a really nice implementation of array types that act like a mutable NamedTuple or struct, but which compose with DifferentialEquations.jl, Optim.jl, or really anything that expects arrays
If you find yourself reaching for this design pattern, ModelingToolkit.jl might interest you
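The first two options in the list above need nothing beyond Base. Here is a minimal sketch, assuming a made-up damped oscillator as the thing being parameterized; the Oscillator type, its fields, and the frequency function are hypothetical names of mine.

```julia
# Option 1: a NamedTuple of parameters.
defaults = (mass = 1.0, stiffness = 4.0, damping = 0.1)

# merge returns a new NamedTuple with some fields replaced; defaults is untouched.
heavier = merge(defaults, (mass = 2.0,))

# Property destructuring (requires Julia 1.7 or later).
(; mass, stiffness) = heavier
heavier_frequency = sqrt(stiffness / mass)

# Option 2: a struct with defaults and a keyword constructor.
Base.@kwdef struct Oscillator
    mass::Float64 = 1.0
    stiffness::Float64 = 4.0
    damping::Float64 = 0.1
end

stiff_damped = Oscillator(damping = 0.2)

# Destructuring also works in argument position, so one function can
# accept either representation as long as the property names match.
frequency((; mass, stiffness)) = sqrt(stiffness / mass)

frequency_from_tuple = frequency(defaults)
frequency_from_struct = frequency(stiff_damped)
```

Writing functions against property names rather than a concrete type keeps it cheap to switch between the options in the list as a model grows.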

Other packages

Some of the above are taken from the October 2022 thread “Packages all Julians should know about” on the Julia Discourse board. Here are some more:

Documenter is what people use to make documentation, and it has some good features like doctests and cross references worth getting familiar with
Literate.jl is a package for Literate Programming that lets you generate markdown for Documenter or a Jupyter notebook from the same source file
Plots.jl with your backend of choice is fine, but I’d recommend looking at Makie, especially for interactivity^b ^b although Plots.jl does have the PlotlyJS backend ⏎ and performance (with an OpenGL backend)—Beautiful Makie has good examples
DataStructures.jl implements standard data structures
DataFrames.jl implements pandas-like DataFrames—don’t miss the tutorial by Bogumił Kamiński
Memoize.jl gives you easy memoization (there’s also Memoization.jl, which at this point may be more general than Memoize)
MacroTools.jl is handy if you’re writing macros
Transducers.jl gives you transducers, which you might know from Clojure^c ^c this description of good use cases for transducers is good to have in the back of your head ⏎
The packages under the JuliaMath GitHub Organization include special functions, intervals, interpolations, root finding, and numerical integration, among others
JuliaDiff has various automatic differentiation packages under its umbrella

Of course there are also more domain-specific organizations and libraries that might as well be standard, like JuliaGraphs/Graphs.jl for graph things or JuliaDSP/DSP.jl⁺ ⁺ that’s where the phase unwrapping utility is, by the way ⏎ for digital signal processing; canonical libraries that have their particular ways of doing things but are nonetheless very generic, like Flux for machine learning; even things outside Julia’s advertised wheelhouse like Franklin for static site generation and the web framework Genie.

Performance, correctness, analysis

Start with Unit Testing and the Performance Tips in the Julia manual, as well as the section on Profiling. There’s also Logging in the standard library.
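As a small taste of how the Test and Logging standard libraries fit together, here is a sketch around a hypothetical helper, clamp_nonnegative, that warns when it changes its input.

```julia
using Test
using Logging

# Hypothetical helper for this example: replaces negative values with zero
# and logs a warning when it does.
function clamp_nonnegative(x)
    if x < 0
        @warn "negative input replaced by zero" x
        return zero(x)
    end
    return x
end

@testset "clamp_nonnegative" begin
    # Ordinary value checks.
    @test clamp_nonnegative(3.0) == 3.0
    @test clamp_nonnegative(0) == 0

    # @test_logs captures log records and checks them against patterns;
    # it returns the value of the expression, so the result can be tested too.
    result = @test_logs (:warn, r"negative") clamp_nonnegative(-1.0)
    @test result == 0.0

    # No log records are expected for valid input.
    @test_logs clamp_nonnegative(2.0)
end
```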

TestItems/TestItemRunner are works in progress but very promising for improving testing workflows
BenchmarkTools gives you better ways to benchmark than using @time to time commands
ProfileView is a visualizer for the results of profiling (one of several; the VS Code extension also has its own)
Infiltrator gives you a macro that acts as a breakpoint
StaticArrays; torrance on the Discourse thread explains it well:

This package allows for small, statically sized arrays. Because their size is known at compile time all sorts of optimisations and efficiencies kick in. Linear algebra operations, for example, are customised at compile time for the specific size of your matrix or array. For me, however, the biggest win is that static arrays are isbits, meaning these will be allocated on the stack not the heap, so these are ideal for cases where you’re working with small arrays of a known size in hot loops.
Tullio is a neat way to write tensor operations that will take advantage of multithreading, SIMD (with LoopVectorization), and other tricks to go fast
JET and Cthulhu do type-level program analysis that can catch bugs and type-unstable code
People seem to be adopting Aqua for some other checks
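If "type-unstable" is unfamiliar, the idea can be shown with the standard library alone. This is not how JET or Cthulhu are used, and they go much further; it is only a sketch of the problem they help find, with two hypothetical functions of mine.

```julia
using Test

# Returns an Int on one branch and a Float64 on the other, so the compiler
# cannot pin down a single concrete return type from the argument type.
unstable_half(x::Int) = iseven(x) ? x ÷ 2 : x / 2

# Always returns a Float64.
stable_half(x::Int) = x / 2

# Base.return_types reports what inference concludes for given argument types.
unstable_inferred = Base.return_types(unstable_half, (Int,))[1]
stable_inferred = Base.return_types(stable_half, (Int,))[1]
println("unstable_half infers to ", unstable_inferred)
println("stable_half infers to ", stable_inferred)

# Test.@inferred throws if the actual return type differs from the inferred one,
# which makes it usable as a regression check in a test suite.
@testset "type stability" begin
    @test (@inferred stable_half(3)) == 1.5
    @test_throws ErrorException @inferred unstable_half(3)
end
```

In the REPL, @code_warntype unstable_half(3) shows the same information annotated on the lowered code.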

Cool stuff

JuMP and its supporting packages for mathematical optimization as well as DifferentialEquations.jl are probably the best in their respective classes in any language.

Cassette lets you do custom compiler passes, which enables all sorts of cool things
I mentioned I was excited about Symbolics.jl/SymbolicUtils back in January. They’ve made a lot of progress but still suffer from the limitations I mentioned then:

The home page calls it a Computer Algebra System, but right now developers seem more focused on symbolic-numeric computation. It’s in an odd situation where it has trouble simplifying arithmetic expressions with nested subtraction and division, but it can dramatically accelerate differential equation solvers. I think the potential here is underappreciated in scientific computing.
Metatheory.jl, the e-graph rewriting backend for Symbolics
Diffractor.jl for autodiff

Again, I’m on Twitter if you have more tips.

Muireall