Thursday, July 7, 2016

Story of two sons (CSON and GSON)

The two sons are similar but not compatible with each other. The first one is com.google.gson.JsonObject (final and not serializable) and the other one is com.couchbase.client.java.document.json.JsonObject (extendable and serializable (!! handsome !!)).

We are storing com.google.gson.JsonObject as a String in Couchbase. The three main operations are:
  • Create an entry in Couchbase
             CouchbaseClient.set(key, 0, jsonObject.toString());
             The second parameter is the expiry.
  • Retrieve an entry from Couchbase
             CouchbaseClient.get(key);
  • Remove an entry from Couchbase
             CouchbaseClient.delete(key);

First level cache - lives within the same JVM (an LRU Linked Hash Map storing com.google.gson.JsonObject as the value).
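As a minimal sketch of how such a first-level cache can be built on LinkedHashMap in access order (the class name LruCache and the String values are made up for illustration; the real cache stores com.google.gson.JsonObject values):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU cache backed by LinkedHashMap with accessOrder = true.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // true = access order, i.e. LRU iteration
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries; // evict the least-recently-used entry
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // touch "a" so "b" becomes the eldest entry
        cache.put("c", "3"); // exceeds capacity, evicts "b"
        System.out.println(cache.keySet()); // [a, c]
    }
}
```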

Second level cache - on a different machine, stored in Couchbase.
We tried to improve the processing time and memory footprint.
  • Studied the following alternatives:
    • MapDB (stored values must be serializable; JsonObject is not serializable)
    • JCache (same serialization problem)
    • EhCache (worked, but slower than the existing cache)
    • GuavaCache (worked, but slower than the existing cache)
    • Caffeine (an improved version of GuavaCache; not tried)
  • The other direction: can we improve how we store into Couchbase?
    • Tried Java client SDK 2.3.1.
    • It has no direct method to store a string.
    • To store a string, I have to wrap it in one of these three documents:
      • JsonStringDocument
      • RawJsonDocument
      • JsonDocument
    • Among the three, JsonStringDocument has the best performance.

  • JsonStringDocument -> Couchbase and Couchbase -> JsonStringDocument

    Benchmark   Mode  Cnt     Score       Error  Units
    get        thrpt  200  3027.148   ± 118.023  ops/s
    put        thrpt  200  5510.690   ± 265.166  ops/s
    remove     thrpt  200  5825.239   ± 208.079  ops/s

  • Using the existing Couchbase client

    Benchmark   Mode  Cnt       Score        Error  Units
    get        thrpt  200   40245.268   ± 4516.327  ops/s
    put        thrpt  200    6718.834    ± 345.201  ops/s
    remove     thrpt  200  223120.874  ± 34296.307  ops/s

  • What is left? What if we can store com.couchbase.client.java.document.json.JsonObject directly?

  • Benchmark  Mode  Cnt     Score       Error  Units
    get        thrpt  200  2901.052   ± 105.179  ops/s
    put        thrpt  200  5157.517   ± 272.308  ops/s
    remove     thrpt  200  5403.609   ± 213.671  ops/s

  • Try again

    Benchmark   Mode  Cnt     Score       Error  Units
    get        thrpt  200  2545.066   ± 164.031  ops/s
    put        thrpt  200  5100.757   ± 272.308  ops/s
    remove     thrpt  200  5125.290   ± 282.888  ops/s

  • No way; we have to think of something different.
Inside a document
id - unique document id (per bucket).
cas - compare-and-swap value of the document. Couchbase Server does not support multi-document transactions or rollback; to support optimistic locking, Couchbase uses a compare-and-swap approach. Whenever a document is mutated, its CAS value changes.
expiry - document expiration time (TTL).
content - the actual JSON content.
mutation token - vBucket id, vBucket UUID, sequence number, bucket name.
transcoder - handles serialization and deserialization.

What next?
Get rid of both sons and store JsonDocument directly in the local LRU cache (com.couchbase.client.java.document.json.JsonObject is still there, hiding behind JsonDocument).

Performance of the new LRU map <String, com.couchbase.client.java.document.JsonDocument>

Benchmark   Mode  Cnt         Score         Error  Units
get        thrpt  200  14960095.580  ± 350243.181  ops/s
put        thrpt  200   3292160.607  ± 107798.708  ops/s
remove     thrpt  200  16809842.641  ± 477968.628  ops/s

Storing JsonDocument directly in Couchbase (two runs):

Benchmark   Mode  Cnt     Score       Error  Units
get        thrpt  200  5565.558   ± 219.635  ops/s
put        thrpt  200  2316.319   ± 135.566  ops/s
remove     thrpt  200  3998.472   ± 281.812  ops/s

Benchmark   Mode  Cnt     Score       Error  Units
get        thrpt  200  6002.283   ± 219.635  ops/s
put        thrpt  200  3042.465   ± 135.566  ops/s
remove     thrpt  200  5551.670   ± 281.812  ops/s

With String -> Couchbase and Couchbase -> String

Benchmark   Mode  Cnt       Score        Error  Units
get        thrpt  200   40245.268   ± 4516.327  ops/s
put        thrpt  200    6718.834    ± 345.201  ops/s
remove     thrpt  200  223120.874  ± 34296.307  ops/s



With com.google.gson.JsonObject -> toString -> Couchbase
and Couchbase -> String -> JsonParser -> com.google.gson.JsonObject

Benchmark   Mode  Cnt       Score        Error  Units
get        thrpt  200   20918.810   ± 1541.196  ops/s
put        thrpt  200    5707.927    ± 303.200  ops/s
remove     thrpt  200  195069.639  ± 30518.167  ops/s




With com.couchbase.client.java.document.JsonDocument -> toString -> Couchbase
and Couchbase -> String -> JsonParser -> com.couchbase.client.java.document.JsonDocument

Benchmark   Mode  Cnt       Score        Error  Units
get        thrpt  200   21455.698   ± 1783.191  ops/s
put        thrpt  200    5849.519    ± 358.121  ops/s
remove     thrpt  200  216540.661  ± 34494.645  ops/s

Why is storing JsonDocument in Couchbase slower?

Are these two methods heavy?
    /**
     * Helper method to write the current document state to the output stream for serialization purposes.
     *
     * @param stream the stream to write to.
     * @throws IOException
     */
    protected void writeToSerializedStream(ObjectOutputStream stream) throws IOException {
        stream.writeLong(cas);
        stream.writeInt(expiry);
        stream.writeUTF(id);
        stream.writeObject(content);
        stream.writeObject(mutationToken);
    }

    /**
     * Helper method to create the document from an object input stream, used for serialization purposes.
     *
     * @param stream the stream to read from.
     * @throws IOException
     * @throws ClassNotFoundException
     */
    @SuppressWarnings("unchecked")
    protected void readFromSerializedStream(final ObjectInputStream stream) throws IOException, ClassNotFoundException {
        cas = stream.readLong();
        expiry = stream.readInt();
        id = stream.readUTF();
        content = (T) stream.readObject();
        mutationToken = (MutationToken) stream.readObject();
    }
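One self-contained way to get a feel for what writeObject costs on the content field is to serialize a representative map with plain Java serialization and compare it against the UTF-8 size of the equivalent JSON text (an illustrative sketch; the class SerializationCost and the sample data are made up, not the actual Couchbase transcoder):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;

class SerializationCost {

    // Size of the default Java serialized form of an object.
    static int javaSerializedSize(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        HashMap<String, Object> content = new HashMap<>();
        content.put("msisdn", "919800000000");
        content.put("plan", "prepaid");
        content.put("balance", 42);

        String json = "{\"msisdn\":\"919800000000\",\"plan\":\"prepaid\",\"balance\":42}";

        System.out.println("Java serialized: " + javaSerializedSize(content) + " bytes");
        System.out.println("JSON string:     " + json.getBytes(StandardCharsets.UTF_8).length + " bytes");
        // The serialized form carries class metadata for HashMap, Integer, etc.,
        // so it is noticeably larger than the raw JSON text; that extra work and
        // size is part of why the plain-string path is cheaper.
    }
}
```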

Memory Mapped Files with MapDB (using HTreeMap)

Benchmark                Mode  Cnt        Score        Error  Units
get                     thrpt  200  2171957.849  ± 87802.008  ops/s
put                     thrpt  200      174.786      ± 8.070  ops/s
remove (without commit) thrpt  200    11142.447  ± 34494.645  ops/s


Memory Mapped Files with MapDB (using HTreeMap with Serializer)

Benchmark   Mode  Cnt        Score         Error  Units
get        thrpt  200  8035837.551  ± 183374.533  ops/s
put        thrpt  200      322.792      ± 10.159  ops/s
remove     thrpt  200      352.866      ± 15.429  ops/s

Memory Mapped Files with MapDB (using BTreeMap)

Benchmark   Mode  Cnt       Score       Error  Units
get        thrpt  200  146187.591  ± 3141.978  ops/s
put        thrpt  200     143.335     ± 3.272  ops/s
remove     thrpt  200     135.999     ± 4.317  ops/s


MapDB (using MemoryDirectDB with an explicit key serializer (Serializer.STRING) and value serializer (Serializer.JAVA))

Benchmark   Mode  Cnt        Score         Error  Units
get        thrpt  200  7339010.108  ± 141256.722  ops/s
put        thrpt  200    58496.491    ± 1858.172  ops/s
remove     thrpt  200    30806.259    ± 1232.421  ops/s


Memory:


With the following code (reference: Performance Mmap):

public MapDBPerformanceMmap() {
    File file = null;
    try {
        file = File.createTempFile("mapdb", "mapdb");
        file.delete();
    } catch (IOException ioe) {
        ioe.printStackTrace();
    }

    DB db = DBMaker.fileDB(file)
            .fileMmapEnable()            // always enable mmap
            .fileMmapEnableIfSupported() // only enable mmap on supported platforms
            .fileMmapPreclearDisable()   // make the mmap file faster
            // Unmap (release resources) the file when it is closed.
            // That can cause a JVM crash if the file is accessed after it was
            // unmapped (there is a possible race condition).
            .cleanerHackEnable()
            .make();

    // Optionally preload the file content into the disk cache.
    db.getStore().preallocate();

    container = db.hashMap("subscriber")
            .keySerializer(Serializer.STRING)
            .valueSerializer(Serializer.JAVA)
            .createOrOpen();
}

Benchmark   Mode  Cnt        Score         Error  Units
get        thrpt  200  7270003.946  ± 239261.466  ops/s
put        thrpt  200    24076.464    ± 3939.325  ops/s
remove     thrpt  200  6855080.382  ± 163478.698  ops/s

In the above run there is no commit on put and remove, so performance is faster.

Trying with a volume:

public MapDBVolume() throws IOException {
    File f = File.createTempFile("some", "file");

    Volume volume = MappedFileVol.FACTORY.makeVolume(f.getPath(), false);

    boolean contentAlreadyExists = false;
    // DB db = DBMaker.volumeDB(volume, contentAlreadyExists).make();
    DB db = DBMaker.onVolume(volume, contentAlreadyExists).make();

    container = db.hashMap("subscriber")
            .keySerializer(Serializer.STRING)
            .valueSerializer(Serializer.JAVA)
            .createOrOpen();
}

Benchmark   Mode  Cnt        Score         Error  Units
get        thrpt  200  7465024.337  ± 121764.657  ops/s
put        thrpt  200    37520.380    ± 6525.319  ops/s
remove     thrpt  200  6371964.849  ± 411445.209  ops/s

Storing <String, String> in MapDB

Benchmark   Mode  Cnt        Score         Error  Units
get        thrpt  200  7649561.628   ± 98402.593  ops/s
put        thrpt  200   185419.302    ± 2909.109  ops/s
remove     thrpt  200  6678492.739  ± 175966.643  ops/s

When we started with the LRUMap and String values, our initial benchmark was:

Benchmark   Mode  Cnt        Score         Error  Units
get        thrpt  200  5350617.205   ± 38618.235  ops/s
put        thrpt  200  1882582.174   ± 47825.210  ops/s
remove     thrpt  200  5215924.639  ± 155916.934  ops/s

Then, converting this LRUMap to <String, JsonDocument>, we got the following results; this is the best in terms of operational speed.

Benchmark   Mode  Cnt         Score         Error  Units
get        thrpt  200  14960095.580  ± 350243.181  ops/s
put        thrpt  200   3292160.607  ± 107798.708  ops/s
remove     thrpt  200  16809842.641  ± 477968.628  ops/s


Questions

  1. Should we store a String, JsonDocument, JsonStringDocument, RawJsonDocument, or JsonObject?
  2. Which will be better in terms of speed?
  3. Which will be better in terms of memory?
  4. Which will be better overall?
  5. What is the best way to quantify the results? We are currently using JMH; is it being used properly?
  6. How many entries do we want to keep in the cache?
  7. What will be the TTL (time to live) configuration?
  8. What is the character of the data?
    1. Is it read intensive?
    2. Is it write intensive?
    3. Is it read/write balanced?
  9. Is the data serializable?
  10. How do we measure cache performance?
    1. Cache hit ratio
    2. Cache miss ratio



Conclusion

  1. In terms of speed, storing in an LRU map <String, com.couchbase.client.java.document.JsonDocument> is better.
  2. Inserted one billion entries in a MapDB BTreeMap.

This is the memory state while inserting one billion entries in a MapDB BTreeMap (using FileDB):
  • GC activity is much less.








Trying to insert one billion entries in an LRU Linked Hash Map
Time taken: 05:45 min

Using MemoryDirectDB and HTreeMap:

This one did not complete. It crashed after half a billion entries.

Final War

Code sample:

for (long i = 0; i < ONE_BILLION; i++) {
    try {
        cache.put(TestUtil.generateRandomMSISDN(), subscriberDocument);
        cache.get(TestUtil.generateRandomMSISDN());
        cache.remove(TestUtil.generateRandomMSISDN());
    } catch (Throwable th) {
        System.err.println("!! ERR, Sorry can not continue after " + i + " !!");
        break;
    }
}

With LRU Map <String, JsonDocument>

Time taken: 7 min 07 sec

With MapDB HTreeMap <String, JsonDocument>

DB db = DBMaker.memoryDirectDB().make();
jsonDocumentHTreeMap = (HTreeMap<String, JsonDocument>) db.hashMap("MAPDB_MEMORY_HTREE_DOC")
        .expireMaxSize(starterSize)
        .keySerializer(Serializer.STRING)
        .valueSerializer(Serializer.JAVA)
        .expireAfterCreate(10, TimeUnit.MINUTES)
        .createOrOpen();

Total time taken: 01:27 hour

Based on operational speed:

Benchmark        Mode  Cnt         Score         Error  Units  Rank
ehcacheGet      thrpt  200  10671607.756  ± 317935.925  ops/s  3
ehcachePut      thrpt  200    488244.207   ± 14400.238  ops/s
ehcacheRemove   thrpt  200   6514443.959  ± 155838.552  ops/s
lruGet          thrpt  200  15513985.643  ± 305717.633  ops/s  1
lruPut          thrpt  200   5310197.732  ± 280607.035  ops/s
lruRemove       thrpt  200  15901187.981  ± 248035.602  ops/s
guavaGet        thrpt  200    263205.786    ± 6233.297  ops/s  4
guavaPut        thrpt  200   1140300.211   ± 53700.048  ops/s
guavaRemove     thrpt  200   6951390.457  ± 154189.470  ops/s
mapDBDMGet      thrpt  200   6542146.702  ± 122820.660  ops/s  5
mapDBDMPut      thrpt  200     20222.411    ± 2847.877  ops/s
mapDBDMRemove   thrpt  200   2831093.336   ± 73676.065  ops/s
caffeineGet     thrpt  200  14310655.071  ± 248814.839  ops/s  2
caffeinePut     thrpt  200   1057360.774   ± 41566.312  ops/s
caffeineRemove  thrpt  200  13894384.333  ± 132390.368  ops/s





Other ways to measure cache performance:

  • cache hit - When a data element is requested from the cache and an element exists for the given key, it is referred to as a cache hit (or simply "hit").
  • cache miss - When a data element is requested from the cache and no element exists for the given key, it is referred to as a cache miss (or simply "miss").
  • Factors that affect the efficiency of a cache:
    • Liveness - how live the data needs to be. The less live, the more it can be cached.
    • Proportion of data cached - what proportion of the data can fit into the resource limits of the machine. For 32-bit Java systems there was a hard limit of 2 GB of address space; 64-bit systems do not have that constraint, but garbage collection issues often make a very large Java heap impractical. Various eviction algorithms are used to evict excess entries.
    • Shape of the usage distribution - if only 300 out of 3,000 entries can be cached, but the Pareto (80/20 rule) distribution applies, it might be that 80% of the time those 300 will be the ones requested. This drives up the average request lifespan.
    • Read/write ratio - the proportion of times data is read compared with how often it is written. Things such as the number of empty rooms in a hotel change often and will be written frequently, whereas the details of a room, such as the number of beds, are immutable; a single write might be followed by thousands of reads.
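Tracking hit and miss counts is easy to bolt onto any of the caches above; a minimal sketch (the class name CacheStats is made up, and a real integration would call recordHit/recordMiss from the cache's get path):

```java
import java.util.concurrent.atomic.LongAdder;

// Tiny thread-safe hit/miss counter one could wrap around any cache.
class CacheStats {
    private final LongAdder hits = new LongAdder();
    private final LongAdder misses = new LongAdder();

    void recordHit()  { hits.increment(); }
    void recordMiss() { misses.increment(); }

    double hitRatio() {
        long h = hits.sum(), m = misses.sum();
        long total = h + m;
        return total == 0 ? 0.0 : (double) h / total; // hits / (hits + misses)
    }

    public static void main(String[] args) {
        CacheStats stats = new CacheStats();
        for (int i = 0; i < 80; i++) stats.recordHit();
        for (int i = 0; i < 20; i++) stats.recordMiss();
        System.out.println("hit ratio = " + stats.hitRatio()); // hit ratio = 0.8
    }
}
```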

How are other people using MapDB?

OpenTripPlanner - using MapDB TreeMaps is observed to be no slower than in-memory maps, while HashMaps are both bigger and slower. It lets OpenTripPlanner run in 400 MB instead of a few GB.


With a new benchmark

Using a new benchmark (computation), comparing compute on the same key and on spread keys, we got the following performance (ops/s):

Benchmark        LRUPandaCache  Caffeine     Guava        MapDB       EhCache
compute_sameKey  189874764.5    172338804.8  43334696.24  469547.183  4231434.466
compute_spread   22218880.26    73839767.57  26506703.39  490255.326  7330457.91
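The compute_sameKey and compute_spread rows exercise Map.compute-style atomic updates; with a plain ConcurrentHashMap from the JDK the pattern looks like this (an illustrative sketch, not the benchmark code itself):

```java
import java.util.concurrent.ConcurrentHashMap;

class ComputeExample {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> counters = new ConcurrentHashMap<>();
        // Atomically create-or-update the value for a key; repeated compute
        // on one key is the "sameKey" case, many distinct keys the "spread" case.
        for (int i = 0; i < 3; i++) {
            counters.compute("sameKey", (k, v) -> v == null ? 1 : v + 1);
        }
        System.out.println(counters.get("sameKey")); // 3
    }
}
```

Contention behaves very differently in the two cases: hammering one key serializes updates on that entry, while spread keys let updates proceed in parallel, which is why the two rows diverge so much for some caches.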


Hang on! We started with the objective of reducing the memory footprint. Where is that comparison?

With MemoryBenchmark, the key is a String and the value is com.google.gson.JsonObject.

Unbounded

Cache                           Baseline     Per Entry
Caffeine                        264 bytes    10 bytes (16 aligned)
Guava                           1,032 bytes  16 bytes (16 aligned)
PandaCache (LRU Linked HashMap) 64 bytes     166 bytes (168 aligned)


After looking at all of these, it feels better to keep GSON (com.google.gson.JsonObject) with Caffeine as the caretaker.

References
  • Inside the document
  • Amdahl's Law
  • Hit ratio