Java Unsafe

http://mishadoff.com/blog/java-magic-part-4-sun-dot-misc-dot-unsafe/

 

Java Magic. Part 4: sun.misc.Unsafe

Java is a safe programming language that prevents programmers from making a lot of stupid mistakes, most of which are related to memory management. But there is a way to make such mistakes intentionally, using the Unsafe class.

This article is a quick overview of the sun.misc.Unsafe public API and a few interesting cases of its usage.

Unsafe instantiation

Before using it, we need to obtain an instance of Unsafe. There is no simple way to do it like Unsafe unsafe = new Unsafe(), because the Unsafe class has a private constructor. It also has a static getUnsafe() method, but if you naively call Unsafe.getUnsafe() you will probably get a SecurityException. This method is available only to trusted code.

public static Unsafe getUnsafe() {
    Class cc = sun.reflect.Reflection.getCallerClass(2);
    if (cc.getClassLoader() != null)
        throw new SecurityException("Unsafe");
    return theUnsafe;
}

This is how Java validates whether code is trusted. It simply checks that the calling code was loaded by the bootstrap class loader, for which getClassLoader() returns null.

We can make our code “trusted”. Use the -Xbootclasspath option when running your program and specify the path to the system classes plus your own class that will use Unsafe.

java -Xbootclasspath:/usr/jdk1.7.0/jre/lib/rt.jar:. com.mishadoff.magic.UnsafeClient

But it’s too hard.

The Unsafe class keeps its instance in a field called theUnsafe, which is marked private. We can steal that variable via Java reflection.

Field f = Unsafe.class.getDeclaredField("theUnsafe");
f.setAccessible(true);
Unsafe unsafe = (Unsafe) f.get(null);

Note: Ignore your IDE. For example, Eclipse shows the error “Access restriction…”, but if you run the code, everything works just fine. If the error is annoying, downgrade errors on Unsafe usage to warnings in:

Preferences -> Java -> Compiler -> Errors/Warnings ->
Deprecated and restricted API -> Forbidden reference -> Warning

Unsafe API

The sun.misc.Unsafe class consists of 105 methods. They fall into a few groups of important methods for manipulating various entities. Here are some of them:

  • Info. Returns low-level memory information.
    • addressSize
    • pageSize
  • Objects. Provides methods for manipulating objects and their fields.
    • allocateInstance
    • objectFieldOffset
  • Classes. Provides methods for manipulating classes and static fields.
    • staticFieldOffset
    • defineClass
    • defineAnonymousClass
    • ensureClassInitialized
  • Arrays. Provides methods for array manipulation.
    • arrayBaseOffset
    • arrayIndexScale
  • Synchronization. Low-level primitives for synchronization.
    • monitorEnter
    • tryMonitorEnter
    • monitorExit
    • compareAndSwapInt
    • putOrderedInt
  • Memory. Direct memory access methods.
    • allocateMemory
    • copyMemory
    • freeMemory
    • getAddress
    • getInt
    • putInt

Interesting use cases

Avoid initialization

The allocateInstance method can be useful when you need to skip the object initialization phase, bypass security checks in a constructor, or create an instance of a class that has no public constructor.
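
As a rough illustration, the sketch below compares a normal constructor call with allocateInstance. The class A and the surrounding demo class are hypothetical; the only Unsafe call used is allocateInstance.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class AllocateInstanceDemo {
    private static final Unsafe U = loadUnsafe();

    // Hypothetical class whose constructor sets a field.
    static final class A {
        private long a;               // stays 0 if the constructor never runs
        A() { this.a = 1; }
        long a() { return a; }
    }

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Regular path: the constructor runs and sets the field to 1.
    static long viaConstructor() {
        return new A().a();
    }

    // Unsafe path: memory is allocated, but the constructor is skipped.
    static long viaUnsafe() throws InstantiationException {
        A o = (A) U.allocateInstance(A.class);
        return o.a();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(viaConstructor()); // prints 1
        System.out.println(viaUnsafe());      // prints 0
    }
}
```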

Memory corruption

This one is familiar to every C programmer. By the way, it is a common technique for bypassing security checks.
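
A minimal sketch of the idea, assuming a hypothetical Guard class that grants access only when a private field holds a magic value. The field is overwritten directly in memory with objectFieldOffset and putInt, without going through any setter.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class MemoryCorruptionDemo {
    private static final Unsafe U = loadUnsafe();

    // Hypothetical access check: client code cannot normally change accessAllowed.
    static final class Guard {
        private int accessAllowed = 1;
        boolean giveAccess() { return accessAllowed == 42; }
    }

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Returns true if access was denied before the write and granted after it.
    static boolean bypass() throws NoSuchFieldException {
        Guard guard = new Guard();
        boolean before = guard.giveAccess();          // false: field is 1

        Field f = Guard.class.getDeclaredField("accessAllowed");
        long offset = U.objectFieldOffset(f);         // where the field lives inside the object
        U.putInt(guard, offset, 42);                  // overwrite it in place

        return !before && guard.giveAccess();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(bypass()); // prints true
    }
}
```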

sizeOf

Using the objectFieldOffset method we can implement a C-style sizeof function.
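
One way to sketch this: walk all non-static fields of the class and its superclasses, take the largest field offset, and round up to the next multiple of 8. This is only an estimate under the assumption of 8-byte object alignment; it ignores arrays and gives a wrong answer for classes without instance fields. The Sample class is hypothetical, and the result depends on the JVM and its settings.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import sun.misc.Unsafe;

final class SizeOfDemo {
    private static final Unsafe U = loadUnsafe();

    // Hypothetical class used only to have something to measure.
    static final class Sample {
        long id;
        int count;
    }

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Estimates the shallow size of an object from its largest field offset.
    static long sizeOf(Object o) {
        long maxOffset = 0;
        for (Class<?> c = o.getClass(); c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (!Modifier.isStatic(f.getModifiers())) {
                    maxOffset = Math.max(maxOffset, U.objectFieldOffset(f));
                }
            }
        }
        // Round up past the last field, assuming 8-byte alignment.
        return (maxOffset / 8 + 1) * 8;
    }

    public static void main(String[] args) {
        System.out.println("estimated size: " + sizeOf(new Sample()));
    }
}
```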

Shallow copy

With an implementation that calculates the shallow object size, we can simply add a function that copies objects. The standard solution requires modifying your code with Cloneable, or you can implement a custom copy function in your object, but it won’t be a general-purpose function.
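
The sketch below is a more conservative variant than copying raw object memory: it allocates an empty instance with allocateInstance and copies every non-static field by its offset. That is a swapped-in technique, chosen because raw object addresses depend heavily on the JVM. It does not handle arrays, and the Person class is hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import sun.misc.Unsafe;

final class ShallowCopyDemo {
    private static final Unsafe U = loadUnsafe();

    // Hypothetical class without Cloneable or a copy constructor.
    static final class Person {
        int age;
        String name;
        Person(int age, String name) { this.age = age; this.name = name; }
    }

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Copies all instance fields of src into a new object; references are shared, not cloned.
    @SuppressWarnings("unchecked")
    static <T> T shallowCopy(T src) throws InstantiationException {
        Object dst = U.allocateInstance(src.getClass());
        for (Class<?> c = src.getClass(); c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers())) continue;
                long off = U.objectFieldOffset(f);
                Class<?> t = f.getType();
                if (t == int.class) U.putInt(dst, off, U.getInt(src, off));
                else if (t == long.class) U.putLong(dst, off, U.getLong(src, off));
                else if (t == boolean.class) U.putBoolean(dst, off, U.getBoolean(src, off));
                else if (t == byte.class) U.putByte(dst, off, U.getByte(src, off));
                else if (t == short.class) U.putShort(dst, off, U.getShort(src, off));
                else if (t == char.class) U.putChar(dst, off, U.getChar(src, off));
                else if (t == float.class) U.putFloat(dst, off, U.getFloat(src, off));
                else if (t == double.class) U.putDouble(dst, off, U.getDouble(src, off));
                else U.putObject(dst, off, U.getObject(src, off));
            }
        }
        return (T) dst;
    }

    public static void main(String[] args) throws Exception {
        Person p = new Person(30, "Ann");
        Person q = shallowCopy(p);
        System.out.println(q != p && q.age == p.age && q.name == p.name); // prints true
    }
}
```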

Hide Password

One more interesting use of direct memory access in Unsafe is removing unwanted objects from memory.

Most of the APIs for retrieving a user’s password have a signature with byte[] or char[]. Why arrays?

It is purely for security reasons, because we can zero out the array elements once we no longer need them. If we retrieve the password as a String, it is stored as an object in memory, and setting the reference to null only drops the reference. The object stays in memory until the GC decides to clean it up.
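
The array half of that argument needs no Unsafe at all. Here is a small sketch, with a hypothetical check method, that wipes a char[] password as soon as it has been compared:

```java
import java.util.Arrays;

final class PasswordDemo {
    // Compares the password and then overwrites it, whether or not it matched.
    static boolean check(char[] password, char[] expected) {
        try {
            return Arrays.equals(password, expected);
        } finally {
            Arrays.fill(password, '\0'); // the characters are gone from this array now
        }
    }

    public static void main(String[] args) {
        char[] typed = "secret".toCharArray();
        System.out.println(check(typed, "secret".toCharArray())); // prints true
        System.out.println(typed[0] == '\0');                      // prints true
    }
}
```

A String offers no equivalent of Arrays.fill, which is why the Unsafe trick is needed to scrub one.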

Multiple Inheritance

There is no multiple inheritance in Java.

Correct, except that we can cast any type to any other type, if we want.

Dynamic classes

We can create classes at runtime, for example from a compiled .class file. To do that, read the class contents into a byte array and pass it to the defineClass method.

Throw an Exception

Don’t like checked exceptions? Not a problem.

getUnsafe().throwException(new IOException());

This method throws a checked exception, but your code is not forced to catch or rethrow it, just like with a runtime exception.
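
A self-contained sketch of the same trick, with hypothetical method names. Note that readConfig has no throws clause, yet an IOException escapes from it; the caller has to catch the broader Exception type, because the compiler rejects a catch for IOException on a body that does not declare it.

```java
import java.io.IOException;
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class SneakyThrowDemo {
    private static final Unsafe U = loadUnsafe();

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // No "throws IOException" here, although one is thrown.
    static void readConfig() {
        U.throwException(new IOException("simulated I/O failure"));
    }

    // Reports the type of whatever readConfig() threw.
    static String caughtType() {
        try {
            readConfig();
            return "nothing thrown";
        } catch (Exception e) {
            return e.getClass().getName();
        }
    }

    public static void main(String[] args) {
        System.out.println(caughtType()); // prints java.io.IOException
    }
}
```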

Fast Serialization

This one is more practical.

Everyone knows that standard Java serialization via Serializable is very slow. It also requires an accessible no-argument constructor in the first non-serializable superclass.

Externalizable is better, but it requires you to define a schema for the class to be serialized.

Popular high-performance libraries like kryo bring in dependencies, which can be unacceptable under low-memory requirements.

But a full serialization cycle can be easily achieved with the Unsafe class.

By the way, there have been some attempts in kryo to use Unsafe: http://code.google.com/p/kryo/issues/detail?id=75

Big Arrays

As you know, the Integer.MAX_VALUE constant is the maximum size of a Java array. Using direct memory allocation we can create arrays whose size is limited only by the available native memory.

In fact, this technique uses off-heap memory and is partially available in the java.nio package.

Memory allocated this way is not located in the heap and is not under GC management, so you have to release it yourself using Unsafe.freeMemory(). It also does not perform any bounds checks, so any illegal access may crash the JVM.

It can be useful for math computations, where code operates on large arrays of data. It can also be interesting for real-time programmers, for whom GC pauses on large arrays can break latency limits.
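
A minimal sketch of such an array, using allocateMemory, putByte, getByte and freeMemory. The class name OffHeapByteArray is made up, and the explicit index check is an addition of this sketch; Unsafe itself checks nothing.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical long-indexed byte array backed by off-heap memory.
final class OffHeapByteArray implements AutoCloseable {
    private static final Unsafe U = loadUnsafe();

    private final long size;
    private long address; // 0 once the memory has been freed

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    OffHeapByteArray(long size) {
        if (size <= 0) throw new IllegalArgumentException("size must be positive");
        this.size = size;
        this.address = U.allocateMemory(size);
        U.setMemory(address, size, (byte) 0); // fresh memory is not zeroed by default
    }

    void set(long index, byte value) {
        check(index);
        U.putByte(address + index, value);
    }

    byte get(long index) {
        check(index);
        return U.getByte(address + index);
    }

    long size() { return size; }

    // Guard added by this sketch: a bad address here could crash the JVM.
    private void check(long index) {
        if (address == 0) throw new IllegalStateException("already freed");
        if (index < 0 || index >= size) throw new IndexOutOfBoundsException("index " + index);
    }

    @Override
    public void close() {
        if (address != 0) {
            U.freeMemory(address); // the GC will never do this for us
            address = 0;
        }
    }

    public static void main(String[] args) {
        try (OffHeapByteArray a = new OffHeapByteArray(1L << 20)) {
            a.set(1000, (byte) 7);
            System.out.println(a.get(1000)); // prints 7
        }
    }
}
```

Since the index is a long, the same class works for sizes beyond Integer.MAX_VALUE, provided the machine has that much native memory to spare.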

Concurrency

And a few words about concurrency with Unsafe. The compareAndSwap methods are atomic and can be used to implement high-performance lock-free data structures.

The intuition behind using the CAS primitive is simple:

  • Have some state
  • Create a copy of it
  • Modify it
  • Perform CAS
  • Repeat if it fails
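
The steps above can be sketched as a lock-free counter built on compareAndSwapInt. The class and method names are hypothetical. The counter field is volatile so that each retry re-reads the current value from memory.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class CasCounter {
    private static final Unsafe U = loadUnsafe();
    private static final long VALUE_OFFSET = offsetOf("value");

    private volatile int value; // the shared state

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    private static long offsetOf(String fieldName) {
        try {
            return U.objectFieldOffset(CasCounter.class.getDeclaredField(fieldName));
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    int increment() {
        int current;
        int next;
        do {
            current = value;     // copy the state
            next = current + 1;  // modify the copy
            // Perform CAS: succeeds only if nobody changed value in the meantime.
        } while (!U.compareAndSwapInt(this, VALUE_OFFSET, current, next)); // repeat if it fails
        return next;
    }

    int get() { return value; }

    // Runs several threads against one counter and returns the final count.
    static int runDemo(int threads, int perThread) throws InterruptedException {
        CasCounter counter = new CasCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) counter.increment();
            });
            workers[i].start();
        }
        for (Thread t : workers) t.join();
        return counter.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runDemo(4, 10_000)); // prints 40000
    }
}
```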

Actually, in reality it is harder than you might imagine. There are a lot of problems, such as the ABA problem, instruction reordering, etc.

If you are really interested, you can refer to the awesome presentation about a lock-free HashMap.

UPDATE: Added the volatile keyword to the counter variable to avoid the risk of an infinite loop.

Conclusion

Although Unsafe has a bunch of useful applications, never use it.