Is there a way to convert a bucket to a cblite database file?

I know it sounds silly, but please consider a scenario like the one below:

  1. There is a bucket (restored from a backup file) that contains 700 MB of data (around 1.5 million records; all devices should run offline, and yes, all 1.5 million records must be stored locally).
  2. All the cblite files located on the clients are somehow deleted. (Let's assume all mobile devices are formatted.)
  3. When a client starts from scratch, Sync Gateway copies all the data from Couchbase Server to the client, and as expected this takes a long time. It does not make sense at all when you consider that this process has to be repeated for every device.

What I'm doing now is syncing the server with only one client (via Sync Gateway), converting that cblite db into a zip file, and creating the db on each client from this zipped cblite file.

My question: is there a way to create a cblite file directly from a bucket, or is there an easier way to solve this problem?

You’ve basically described exactly what you should do. Once you sync to a Couchbase Lite client, you can retrieve the database and use that to seed your devices via the replaceDatabase API. Note that this API is unfortunately not the same signature in every platform so without knowing which yours is I can’t give a specific example.

thanks for the answer @borrrden.

I have 2 different clients, actually.

  1. Java client: a simple program that takes CSV files, creates a cblite database from the records in those files, and pushes the data to Sync Gateway. The program shuts itself down once the push is finished. Besides that, there is a scheduled job that zips the cblite file created by the program and deploys the zip every night to a URL that the Android clients can reach when they run the app for the first time.

  2. Android client(s): when the app starts for the first time, it checks whether a local db exists. If it doesn't, the app fetches the URL mentioned above, then downloads the file and unzips it into a directory used by the app. Thanks to this process, the clients don't have to pull 1.5 million records through Sync Gateway after installing the app.
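The nightly job in (1) boils down to zipping the database directory and publishing the archive. A minimal JDK-only sketch of the zip step (the class and path names are hypothetical; it assumes the 1.2+ `.cblite2` directory layout and stores entries relative to the directory's parent so the archive unpacks back to `<name>.cblite2/...`):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class DbZipper {
    // Recursively zips a Couchbase Lite database directory
    // (e.g. "terminalonline.cblite2") into zipFile.
    public static void zipDirectory(Path dbDir, Path zipFile) throws IOException {
        try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(zipFile));
             Stream<Path> files = Files.walk(dbDir)) {
            files.filter(Files::isRegularFile).forEach(p -> {
                try {
                    // Entry names are relative to the db directory's parent,
                    // so unzipping recreates "<name>.cblite2/...".
                    zos.putNextEntry(new ZipEntry(dbDir.getParent().relativize(p).toString()));
                    Files.copy(p, zos);
                    zos.closeEntry();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }
}
```

The resulting zip can then be uploaded to whatever URL the Android clients check on first launch.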

However, while we work on the test branch we sometimes need to recreate the bucket and restart Sync Gateway because of dirty records. The bucket gets restored from the backup file, and the 1.5 million records are pulled by a Java client through Sync Gateway. Although this cblite database is created through Sync Gateway, it does not work like the one in step 1.

I mean: the cblite database created by the replication is zipped and deployed once the replication finishes, but clients still try to sync as if they had different data; the app locks up because of the replication and then crashes with lots of out-of-memory exceptions. What I expect is that after unzipping this file on any Android client there should be no problem, just like in step 1, and the Android clients should behave the same way they do in step 1.

I'm sorry for the long explanation, but I needed to describe the situation. What I need to know is how Android clients can keep running with the latest data, without needing a full sync, after the bucket gets restored.

@hideki Do you have a unit test for this? If you seed a database it should pick up replication where it left off.

@borrrden CBL Android/Java does not have a unit test that does a pull replication into a prepared database.

@mustafaguven, how do you replace the database in your app?

I just read the following comment in our mailing list.
https://groups.google.com/forum/#!msg/mobile-couchbase/yD8N4zVp7nw/OGUhbkwpBQAJ

This mailing list user faced the problem because the app replaced the database files directly instead of using the replaceDatabase method. Does your app use Manager.replaceDatabase(String databaseName, String databaseDir)?

Thanks!

Hi again, and thank you for your help. Actually the Android clients have no database at all until the zip file is used, because all cblite files on the Android clients are removed so the app can run from scratch, as if it were the first launch. So I create the cblite db from the zipped cblite file, which was created by the Java client through Sync Gateway. As I said, the Java client pulls the data from the server through Sync Gateway, creates a cblite database, zips it, and publishes it for the Android clients.

How? This is probably the part that is not correct. You need to unzip it (I assume you didn't alter the structure, right?) and use the replaceDatabase API. You cannot simply write the database into place using your own method.

Of course I'm unzipping it. I guess I couldn't explain it clearly enough: there is no local db on the Android clients to be replaced until I unzip the file. The local Couchbase db on Android is removed before the process. Assume there is no local db; I only have a zipped Couchbase Lite database, created through Sync Gateway on another client (the Java client). I zip it and share it with the other clients, that's all. As I said, until then the Android clients have no db.

By the way, once replication starts I'm getting the following error, which I never saw before the db was restored.

FATAL EXCEPTION: Thread-waitForPendingFutures[PusherInternal{http://10.100.0.156:4984/terminalonline/, push, 6adcf}]
java.lang.OutOfMemoryError: [memory exhausted]
at dalvik.system.NativeStart.main(Native Method)

I see what you mean now. However the API doesn’t need to replace a database even though that’s what it is called. If there is no database there it will just put it in.

So you mean I should always use replaceDatabase instead of getExistingDatabase?

Not “always” but when you want to insert a database from an outside source, you should use replaceDatabase once. Then after that you can use getExistingDatabase.
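In other words: install the seed once with replaceDatabase, then open it with getExistingDatabase on every later launch. A minimal sketch of that decision (JDK only; the `.cblite2` directory name assumes the 1.2+ database layout, the class is hypothetical, and the SDK calls are left as comments because they need the Couchbase Lite library):

```java
import java.io.File;

public class SeedOnce {
    // Returns true if the pre-built database still has to be installed.
    // Couchbase Lite 1.2+ stores each database as a "<name>.cblite2"
    // directory inside the Manager's directory.
    public static boolean needsSeeding(File managerDir, String dbName) {
        return !new File(managerDir, dbName + ".cblite2").exists();
    }

    public static void openOrSeed(File managerDir, String dbName, String seedDir) {
        if (needsSeeding(managerDir, dbName)) {
            // One-time install of the unzipped seed; with the real SDK:
            //   manager.replaceDatabase(dbName, seedDir);
        }
        // From here on, always open the installed copy:
        //   Database db = manager.getExistingDatabase(dbName);
    }
}
```

The point is that replaceDatabase also fixes up internal replication state, which copying the files into place by hand does not.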

I see. I tried it the way you said: after unzipping the file I first call replaceDatabase, then get the database with getExistingDatabase as before. Nothing changed; I'm still getting out-of-memory errors due to replication. I wonder whether something is wrong with the restored bucket, but if I don't turn replication on, I can work with the documents. I'm really desperate…

Can you show the code you are using and let us know what network traffic, if any, you are seeing logged by sync gateway?

EDIT: I also tried copying the database without zipping, using replaceDatabase. Again, nothing changed. I'm totally sure it's not a zipping/unzipping or replaceDatabase/getExistingDatabase issue. Do you think it is related to a backup/restore issue?

@Override public void onResume() {
  if (applicationCache.getDatabaseHelperState() != DatabaseHelperState.UNZIPPING) {
    verifyDatabase();
  }
}

  private void verifyDatabase() {
    databaseHelper.verify().subscribe(new BaseSubscriber<Boolean>() {
      @Override public void onNext(Boolean isVerified) {
        if (isVerified) {
          view.onDatabaseVerified();
        } else {
          if (databaseHelper.isDownloadRequired()) {
            view.onDownloadRequiredForDatabase();
          } else {
            view.showProgress(R.string.unzipping_database_dialog_description);
            view.onUnzipRequiredForDatabase();
          }
        }
      }
    });
  }


  //below methods are placed in different class,
  //I'm sharing it to show the way how I unzip the zipped file
  @Override public Observable<Boolean> unzip() {
    applicationCache.updateDatabaseHelperState(DatabaseHelperState.UNZIPPING);
    return Observable.create((Observable.OnSubscribe<Boolean>) subscriber -> {
      try {
        fileUtil.unzip(fileUtil.getDownloadedDatabaseFile().getPath(),
            manager.getContext().getFilesDir());
        manager.replaceDatabase(fileUtil.getDatabaseName(), fileUtil.getDatabaseAbsolutePath());
        Timber.d("%s unzipping is finished successfully", fileUtil.getDatabaseName());
        applicationCache.updateDatabaseHelperState(DatabaseHelperState.UNZIPPING_SUCCESFULLY);
        subscriber.onNext(true);
      } catch (IOException e) {
        applicationCache.updateDatabaseHelperState(DatabaseHelperState.UNZIPPING_FAILED);
        subscriber.onError(e);
      }
    }).subscribeOn(worker).observeOn(main);
  }

  @Override public Database getDatabase() {
    try {
      this.database = manager.getExistingDatabase(fileUtil.getDatabaseName());
    } catch (CouchbaseLiteException e) {
      this.database = null;
    }
    return this.database;
  }
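The fileUtil.unzip helper called above isn't shown; a minimal JDK-only version of such a helper might look like this (hypothetical class, assuming the archive stores the database directory with relative entry names, as produced by the nightly job):

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class UnzipUtil {
    // Extracts every entry of the archive under targetDir,
    // recreating the relative directory structure of the zip.
    public static void unzip(Path zipPath, Path targetDir) throws IOException {
        try (ZipFile zip = new ZipFile(zipPath.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                Path out = targetDir.resolve(entry.getName()).normalize();
                // Guard against zip-slip: entries must stay inside targetDir.
                if (!out.startsWith(targetDir)) continue;
                if (entry.isDirectory()) {
                    Files.createDirectories(out);
                } else {
                    Files.createDirectories(out.getParent());
                    try (InputStream in = zip.getInputStream(entry)) {
                        Files.copy(in, out, StandardCopyOption.REPLACE_EXISTING);
                    }
                }
            }
        }
    }
}
```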

You mention somewhere above that you observe the sync issue only when you clear the bucket, reimport data into the bucket, and then restart the Sync Gateway. So what do the clients do when you clear the bucket? Do the Android clients keep a local copy of the previously unzipped .cblite (the old data), or do you clear the local database and repeat the process of reimporting the zip file and start fresh?

Hi @priya.rajagopal,

All local databases are removed. As I said before, assume there is no local db, as if the app were being installed for the first time.

I believe there must be an issue with backing up and restoring the file. If you pull documents from a restored bucket via Sync Gateway and try to use the resulting cblite database on other clients, it simply does not work. However, if you create a local database on the client side, create a bucket on the server, push the data from cblite to the bucket via Sync Gateway, and finally use that cblite file on other clients, everything works well.

If you push from client to server, you can use that Couchbase Lite file on other clients.
If you create the cblite database by pulling from a restored bucket via Sync Gateway, you cannot use it on other clients.

" If you pull documents from a restored bucket via sync gateway "

Unless you are using SG 1.5 / CB 5.0, you will not be able to automatically import the documents from the bucket via the SGW. You will probably have to do something like bucket shadowing (which will be deprecated going forward).
In order for the documents to be replicated to the mobile clients, sync metadata needs to be added to them. The sync metadata gets added only if the documents are processed by the SGW. When documents are pushed up from mobile clients, they go through the SGW, so this sync metadata gets added to the documents and everything is good.
When you are pulling documents from a restored bucket, I am not sure how they are getting imported through the Sync Gateway, because if the documents are not processed for sync metadata, they won't get replicated to the clients. That should explain the difference in behavior you are observing.

@priya.rajagopal, thank you very much for your informative explanation. So what are my options, other than shadowing, which I already know is not recommended?

I was planning to write a Jenkins job that creates a test stage from scratch every time it is triggered. It would reset the bucket, and the Android test devices would remove their local databases. Pushing 1.5 million records from a client to the bucket from scratch every time does not make sense at all, which is what I do for major testing right now. Do you have any suggestions for solving the problem at this point?

Assuming you want all 1.5 million records in the bucket to be synced to the clients: instead of restoring the bucket with the documents and then attempting to sync them to clients, why don't you just use the SG REST _bulk_docs API to post the 1.5 million documents into the bucket via the SG? That way they will be processed by the SGW for syncing and become available to mobile clients as well. But they would still need to be synced to the Android clients (although you could try to use the pre-built DB to avoid this).
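A _bulk_docs call is a plain REST POST against the Sync Gateway. A minimal sketch of such a request (the host, database name, and documents here are hypothetical, and the JSON body is assembled by hand to keep the example dependency-free):

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;

public class BulkDocsPoster {
    // Builds the _bulk_docs request body: {"docs":[<doc JSON>, ...]}.
    // Each element of docJsons must already be a valid JSON object string.
    public static String buildBody(List<String> docJsons) {
        return "{\"docs\":[" + String.join(",", docJsons) + "]}";
    }

    // POSTs one batch to the Sync Gateway, so the documents are processed
    // for sync metadata and become visible to mobile clients.
    public static int post(String sgUrl, String db, List<String> docJsons) throws Exception {
        URL url = new URL(sgUrl + "/" + db + "/_bulk_docs");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(buildBody(docJsons).getBytes("UTF-8"));
        }
        return conn.getResponseCode();
    }
}
```

For 1.5 million records you would post them in batches of a few thousand documents per request rather than one giant body.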

If you have the flexibility to use SG 1.5 (now in Beta) and Couchbase Server 5.0 (also in beta), that would make this process a whole lot easier.

But they would still need to be synced to the Android client (although you could try to use the pre-built DB to avoid this).

This is exactly what I'm suffering from. It should have worked without needing to sync again, because I simply pull all the documents via SG, and what I can't understand is how/why SG does not process the documents for sync metadata while pulling all of them through it.