At the company where I worked a couple of years ago, we were building an internal Java framework for creating configuration applications. One day we received a feature request to support storing passwords in character arrays instead of strings, for security reasons. The reporters wanted to clear the memory in a controlled way instead of waiting for garbage collection. But does it actually do the job in modern JVM applications? Let’s find out.

TL;DR

We cannot erase sensitive data from strings until they are garbage-collected, because they are immutable. We can use character arrays instead, but this approach requires end-to-end support from all libraries and tools that we use.

Why might we want it?

On the JVM, we cannot control when our objects are deleted and their memory released. In most cases, this is a blessing that not only simplifies development, but also protects us from certain attack vectors. But when it comes to sensitive data, things get complicated. Obviously, we should remove it from memory once we no longer need it, but without explicit memory management we cannot control when exactly this removal happens. String objects in Java are immutable, so the only option is to wait until the garbage collector reclaims the unused string. If a crash produces a core dump in the meantime, or someone gains unauthorized access to the JVM with a profiler, this data can be revealed.

Character arrays seem to be the way to work around the issue. We still cannot control when they are deleted, but at least we can erase their content. On the other hand, they are not a widely used data type and we cannot expect good support from libraries. In order to work with them, we might need to write some extra custom code.
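As a minimal sketch of that difference: an array can be overwritten in place, while a String built from it keeps its own copy of the characters. The class and method names below are hypothetical, introduced only for illustration.

```java
import java.util.Arrays;

// Hypothetical helper, not part of any library.
final class SecretWiper {
    private SecretWiper() {}

    // Overwrites every element so the secret no longer sits in this array.
    // A null array is silently ignored.
    static void wipe(char[] secret) {
        if (secret != null) {
            Arrays.fill(secret, '\0');
        }
    }

    // Returns true if the array contains only zero characters.
    static boolean isWiped(char[] secret) {
        for (char c : secret) {
            if (c != '\0') {
                return false;
            }
        }
        return true;
    }
}
```

If we create char[] password = {'s','3','c','r','3','t'}, build String copy = new String(password) and then call SecretWiper.wipe(password), the array holds only zero characters, while copy still equals "s3cr3t". Wiping the array does nothing for any String that was derived from it.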

Passwords in char arrays in practice

Let’s try to write a short web application in Java that stores sensitive data in character arrays.

Step 1: JPA

Let’s begin with creating a field in our JPA entity with type char[]:

@Entity
@Data   // this is from Lombok
public class User {
    @Id
    @Column
    private long id;

    @Column
    private String username;

    @Column
    private char[] password;
}

This part is trivial. In the case of Hibernate, such a mapping just works. By looking at the table of type mappings we can see that char[] is recognized as CharArrayType and mapped to VARCHAR in the database.

Step 2: clearing unused data

The next step is clearing the sensitive data on demand. Only we know when it is safe to do it, so we need a custom method in our class:

@Entity
@Data   // this is from Lombok
public class User {
    @Id
    @Column
    private long id;

    @Column
    private String username;

    @Column
    private char[] password;

    public void clearSensitive() {
        Arrays.fill(password, '\0');
    }
}

A sample usage in Spring could look like this:

@RestController
@RequestMapping("/user")
@Transactional
public class UserResource {
    private final UserRepository userRepository;

    public UserResource(UserRepository userRepository) {
        this.userRepository = userRepository;
    }

    @PostMapping(consumes = "application/json")
    public ResponseEntity<Void> addUser(@RequestBody User user)
        throws URISyntaxException
    {
        userRepository.save(user);
        user.clearSensitive();
        return ResponseEntity.created(new URI("/user/" + user.getId())).build();
    }
}

Here we notice the first catch. Java has no built-in concept of clearing unused data, and the frameworks don’t support us in this task. This means that we need to remember to call clearSensitive() every time we finish working with sensitive data. We might also consider implementing the AutoCloseable interface, but it was originally created for a different purpose, and it is still not magic: a library must recognize this interface in order to make use of it.
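To illustrate the AutoCloseable idea, here is a rough sketch under the assumption that our own code controls the scope in which the secret is used. SensitiveChars and LoginExample are hypothetical names, not types from any framework.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical wrapper; not a library type.
final class SensitiveChars implements AutoCloseable {
    private final char[] value;

    SensitiveChars(char[] value) {
        // Takes ownership of the caller's array without copying it,
        // so close() wipes the only instance we know about.
        this.value = Objects.requireNonNull(value);
    }

    char[] get() {
        return value;
    }

    @Override
    public void close() {
        Arrays.fill(value, '\0');
    }
}

final class LoginExample {
    private LoginExample() {}

    // Hypothetical caller: the comparison is evaluated first, then the
    // typed array is wiped when the try block exits (normally or not).
    static boolean matches(char[] typed, char[] expected) {
        try (SensitiveChars pw = new SensitiveChars(typed)) {
            return Arrays.equals(pw.get(), expected);
        }
    }
}
```

This only helps inside a try-with-resources block that we write ourselves. Spring, Hibernate or a JSON parser will not call close() on our behalf.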

JDBC

So far it looks like we have achieved our goal: we store the password in a char array and we call clearSensitive() in many different places. But have we? Let’s take a closer look: we built our application with the help of various third-party libraries and a massive framework. Who said that all of them honor the char[] type and do not convert it into a string under the hood? If a single piece of code makes this conversion, our efforts are pointless.

So let’s dive into Hibernate. It does not talk to the database directly, but uses JDBC instead. JDBC is a standard Java API for accessing relational database engines: individual engines provide drivers, and we get a uniform API. To bind data to queries, Hibernate uses a PreparedStatement instance from JDBC. If we look closer, we notice that it has many methods for binding different data types: setByte(), setInt(), setString(). There’s no explicit method like setCharacterArray(), but it seems that setCharacterStream() could do the job. Let’s figure it out by looking at the internal definition of CharArrayType in Hibernate:

public class CharArrayType extends AbstractSingleColumnStandardBasicType<char[]> {
    public static final CharArrayType INSTANCE = new CharArrayType();

    public CharArrayType() {
        super( VarcharTypeDescriptor.INSTANCE, PrimitiveCharacterArrayTypeDescriptor.INSTANCE );
    }

    public String getName() {
        return "characters"; 
    }

    @Override
    public String[] getRegistrationKeys() {
        return new String[] { getName(), "char[]", char[].class.getName() };
    }
}

Here we see nothing interesting. The class is just a simple definition and it delegates the actual binding to VarcharTypeDescriptor. But… this is the same descriptor that is used for String type, too! This is a warning sign and indeed, this is what we can find inside:

@Override
protected void doBind(PreparedStatement st, X value, int index, 
    WrapperOptions options) throws SQLException
{
    st.setString(index, javaTypeDescriptor.unwrap(value, String.class, options));
}

This single line of code means that Hibernate actually converts our character arrays to Strings in order to pass them to JDBC. Even if we erase our character array, the original content remains in memory in a second, immutable String object. We failed to achieve our goal.
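For comparison, a binding that goes through setCharacterStream() instead of setString() might look like the sketch below. CharArrayBinder is a hypothetical helper, not Hibernate code, and wiring it into Hibernate would require a custom type that is not shown here. It is also only an assumption that this helps: a JDBC driver is free to read the stream into a String internally, so this would have to be verified for the specific driver in use.

```java
import java.io.CharArrayReader;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Types;

// Hypothetical helper; not part of Hibernate or JDBC.
final class CharArrayBinder {
    private CharArrayBinder() {}

    // Binds a char[] parameter through a Reader, so that our own code
    // never creates a String from the sensitive characters.
    static void bind(PreparedStatement st, int index, char[] value) throws SQLException {
        if (value == null) {
            st.setNull(index, Types.VARCHAR);
            return;
        }
        // CharArrayReader reads directly from the given array without copying it.
        st.setCharacterStream(index, new CharArrayReader(value), value.length);
    }
}
```

The array must stay intact until the statement has been executed, because the driver may read from the stream lazily. Only after that point is it safe to wipe it.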

In short…

Hibernate converts the char[] type into a String internally in order to pass it to the JDBC API.

JSON

The second place to look at lies on the other side of the application. Most modern microservices exchange information in the textual JSON format. Spring and other web frameworks nicely hide all the boilerplate code related to parsing JSON documents and producing Java objects from them. However, the code is still there and gets called with every request.

Note that the JSON format does not have a concept of “sensitive data”. Every quoted value is just a string, and the parser has no way to distinguish sensitive data from non-sensitive data. This means that it would process a field called “password” in exactly the same way as, let’s say, “currentWeather”: the value would have the String type. In theory, it is possible to build a parser that knows in advance which fields are sensitive and uses this information for parsing, but personally I have never seen such a parser.

In short…

If you send sensitive data in JSON documents, the JSON parser will likely return it to you as a String object, because it does not know which fields contain sensitive data and which do not.

Security recommendations

In the SEI CERT Oracle Secure Coding Standard for Java we can find two recommendations about storing sensitive data:

  1. Rule MSC03-J: Never hard code sensitive information
  2. Recommendation MSC59-J: Limit the lifetime of sensitive data

Point 1 is rather obvious: we should never hard-code sensitive information. The second recommendation, on the other hand, is what we are looking for. But here’s the thing: it is about minimizing the lifetime of the data, and implementing it is not a silver bullet. We cannot avoid loading this data into memory, and there is always a chance that an ill-timed core dump records something critical. Don’t get me wrong: this recommendation makes perfect sense, and whenever possible you should follow it and use char arrays for sensitive data. But also remember that on the JVM, you need strong support from your dependencies in order to make it work as intended.

However, character arrays may protect us from a slightly different issue. A common problem in many applications is accidental exposure of sensitive data by writing large object graphs into debug logs. Character arrays are safer to use, because calling .toString() on them returns only a meaningless identifier such as [C@1b6d3586, not their content. Another possible solution is using a wrapper type for such strings that returns an empty value from .toString().
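A wrapper type of that kind could be sketched as follows. RedactedString is a hypothetical name, and returning a fixed placeholder instead of an empty value is an arbitrary choice made here so that the redaction stays visible in the log.

```java
import java.util.Objects;

// Hypothetical wrapper type; the name and API are assumptions for illustration.
final class RedactedString {
    private final String value;

    RedactedString(String value) {
        this.value = Objects.requireNonNull(value);
    }

    // Explicit accessor: reading the secret has to be a deliberate act.
    String reveal() {
        return value;
    }

    // Loggers and string concatenation end up here, so the secret
    // never leaks through an implicit conversion.
    @Override
    public String toString() {
        return "[REDACTED]";
    }
}
```

With this type, "user=" + new RedactedString("s3cr3t") evaluates to "user=[REDACTED]". Note that the protection offered by a bare char[] is narrower: concatenation and .toString() are safe, but String.valueOf(char[]) and System.out.println(char[]) print the actual characters.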

Conclusion

We can see that our original motivation for using char arrays is valid only if we can guarantee that they will never be converted into a String, either by our code or by any third-party dependency. Below you can find my personal checklist for sensitive data:

  • invest in char arrays and clearing them if you can guarantee that the sensitive data never gets converted into a String,
  • avoid accidentally writing sensitive data into logs or audit trails; character arrays may help you with that, but there are also other options,
  • focus on hardening elsewhere: block all diagnostic access ports on production systems and consider disabling core dumps on crash (all in all, crashes of the JVM should never be your fault).

Sample code
