Analyzing Ceph BlueStore Initialization

Nov 25, 2020

This post explains how the Ceph OSD (Object Storage Daemon) initializes BlueStore. It draws heavily on a blog post from ProgrammerSought [1]. The analysis is based on Ceph v16.0.0.

1. Ceph OSD main()

When a Ceph OSD launches, it determines the type of its object store and initializes the corresponding backend.

src/ceph_osd.cc main()

bool mkfs = false;
...
if (ceph_argparse_flag(args, i, "--mkfs", (char*)NULL)) {
  mkfs = true;
}
...

std::string data_path = g_conf().get_val<std::string>("osd_data");
...

// the store
std::string store_type;
{
  char fn[PATH_MAX];
  snprintf(fn, sizeof(fn), "%s/type", data_path.c_str());
  int fd = ::open(fn, O_RDONLY|O_CLOEXEC);
  if (fd >= 0) {
    bufferlist bl;
    bl.read_fd(fd, 64);
    if (bl.length()) {
      store_type = string(bl.c_str(), bl.length() - 1);  // drop \n
      dout(5) << "object store type is " << store_type << dendl;
    }
    ::close(fd);
  } else if (mkfs) {
    store_type = g_conf().get_val<std::string>("osd_objectstore");
  } else {
    // hrm, infer the type
    snprintf(fn, sizeof(fn), "%s/current", data_path.c_str());
    struct stat st;
    if (::stat(fn, &st) == 0 && S_ISDIR(st.st_mode)) {
      derr << "missing 'type' file, inferring filestore from current/ dir"
          << dendl;
      store_type = "filestore";
    } else {
      snprintf(fn, sizeof(fn), "%s/block", data_path.c_str());
      if (::stat(fn, &st) == 0 && S_ISLNK(st.st_mode)) {
        derr << "missing 'type' file, inferring bluestore from block symlink"
            << dendl;
        store_type = "bluestore";
      } else {
        derr << "missing 'type' file and unable to infer osd type" << dendl;
        forker.exit(1);
      }
    }
  }
}

std::string journal_path = g_conf().get_val<std::string>("osd_journal");
uint32_t flags = g_conf().get_val<uint64_t>("osd_os_flags");
ObjectStore *store = ObjectStore::create(g_ceph_context,
                                         store_type,
                                         data_path,
                                         journal_path,
                                         flags);
...

OSD *osdptr = nullptr;
osdptr = new OSD(g_ceph_context,
      store,
      whoami,
      ms_cluster,
      ms_public,
      ms_hb_front_client,
      ms_hb_back_client,
      ms_hb_front_server,
      ms_hb_back_server,
      ms_objecter,
      &mc,
      data_path,
      journal_path,
      poolctx);
osdptr->pre_init();
osdptr->init();
osdptr->final_init();
...
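
The decision order in that excerpt can be condensed into a small standalone function. This is only a sketch under a few assumptions: detect_store_type() is a hypothetical helper and not a Ceph function, it returns an empty string where main() would exit, and it checks the block entry with lstat() (the excerpt calls ::stat()) so that the symlink itself is inspected.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <sys/stat.h>

// Hypothetical helper, not part of Ceph. Returns the store type, or an empty
// string when the type cannot be determined.
std::string detect_store_type(const std::string& data_path, bool mkfs,
                              const std::string& configured_type) {
  assert(!data_path.empty());

  // 1. An existing "type" file always wins.
  std::ifstream in(data_path + "/type");
  if (in.is_open()) {
    std::string type;
    std::getline(in, type);  // reads the first line without the trailing '\n'
    return type;
  }

  // 2. During --mkfs nothing is on disk yet, so use the configured value
  //    (osd_objectstore in the excerpt).
  if (mkfs) return configured_type;

  // 3. Otherwise infer the type from the on-disk layout.
  struct stat st;
  if (::stat((data_path + "/current").c_str(), &st) == 0 && S_ISDIR(st.st_mode))
    return "filestore";
  // Assumption of this sketch: lstat() so the link itself is examined.
  if (::lstat((data_path + "/block").c_str(), &st) == 0 && S_ISLNK(st.st_mode))
    return "bluestore";

  return "";  // the real code logs an error and exits here
}
```

Calling detect_store_type(path, true, "bluestore") on a directory without a type file returns the configured value, which corresponds to the --mkfs branch above.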

g_conf() gives access to the daemon's configuration, and get_val<T>() looks up a single option by name. The values are read from, among other sources, the Ceph configuration file. By default its location is /etc/ceph/ceph.conf, but it can be overridden with the -c or --conf argument given to the Ceph OSD daemon. Because cephadm launches daemons in containers, the file lives inside the container:

$ podman exec -it ceph-<fsid>-osd.0 cat /etc/ceph/ceph.conf
# minimal ceph.conf for <fsid>
[global]
        fsid = <fsid>
        mon_host = [v2:<mon_ip>:3300/0,v1:<mon_ip>:6789/0]

Assuming store_type == "bluestore", ObjectStore::create() creates a new BlueStore instance and returns it; main() keeps the returned pointer in the variable store.

src/os/ObjectStore.cc ObjectStore::create()

ObjectStore *ObjectStore::create(CephContext *cct,
				 const string& type,
				 const string& data,
				 const string& journal,
				 osflagbits_t flags) {
  ...
  #if defined(WITH_BLUESTORE)
  if (type == "bluestore") {
    return new BlueStore(cct, data);  /* src/os/bluestore/BlueStore.cc */
  }
  #endif
  ...
}
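
The shape of that factory can be shown with simplified stand-in classes. The sketch below is not Ceph code: ObjectStoreSketch, BlueStoreSketch and create_store() are hypothetical names, the smart pointer is a choice of this sketch (the excerpt returns a raw pointer), and returning a null pointer for an unknown type is assumed here for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Simplified stand-ins for illustration only.
struct ObjectStoreSketch {
  explicit ObjectStoreSketch(std::string p) : path(std::move(p)) {}
  virtual ~ObjectStoreSketch() = default;
  virtual std::string type() const = 0;
  std::string path;  // data directory of the store
};

struct BlueStoreSketch : ObjectStoreSketch {
  using ObjectStoreSketch::ObjectStoreSketch;  // inherit the constructor
  std::string type() const override { return "bluestore"; }
};

// Picks the concrete backend from the type string; callers only ever see
// the base-class interface.
std::unique_ptr<ObjectStoreSketch> create_store(const std::string& type,
                                                const std::string& data) {
  if (type == "bluestore") {
    return std::make_unique<BlueStoreSketch>(data);
  }
  return nullptr;  // unknown type: the caller has to check for this
}
```

The caller works with the base-class pointer only, which is why the OSD code can hold an ObjectStore* without knowing which backend is behind it.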

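The class BlueStore is declared in src/os/bluestore/BlueStore.h. Its destructor is marked override rather than virtual. That is sufficient: override only compiles when the base-class destructor (here ObjectStore's) is virtual, and a destructor that overrides a virtual one is itself virtual.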

src/os/bluestore/BlueStore.h

class BlueStore : public ObjectStore, public md_config_obs_t {
  BlueStore(CephContext *cct, const std::string& path);
  BlueStore(CephContext *cct, const std::string& path, uint64_t min_alloc_size);
  ~BlueStore() override;
  ...
};
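
On the destructor declaration: in C++, a destructor marked override is implicitly virtual, because override requires a virtual destructor in the base class. The small sketch below uses hypothetical Base and Derived types (unrelated to Ceph) to check this at compile time and at run time.

```cpp
#include <cassert>
#include <type_traits>

struct Base {
  virtual ~Base() = default;
};

struct Derived : Base {
  explicit Derived(bool* flag) : destroyed(flag) {}
  // No `virtual` keyword here; `override` already guarantees that this
  // destructor overrides a virtual one, otherwise it would not compile.
  ~Derived() override { *destroyed = true; }
  bool* destroyed;
};

static_assert(std::has_virtual_destructor<Derived>::value,
              "a destructor marked override is virtual");

// Deletes a Derived through a Base pointer and reports whether the
// Derived destructor actually ran.
bool destroy_through_base() {
  bool flag = false;
  Base* b = new Derived(&flag);
  delete b;
  return flag;
}
```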

This BlueStore instance is kept in the store member variable of the OSD class.

src/osd/OSD.h

class OSD : public Dispatcher, public md_config_obs_t {
  ...
  ObjectStore *store;
};

OSD::OSD(CephContext* cct_,
         ObjectStore* store_,
         ...)
  : ...,
    store(store_),
    ... 

2. OSD init()

The actual initialization work happens in OSD::init():

src/osd/OSD.cc:

int OSD::init() {
  ...
  string val;
  store->read_meta("require_osd_release", &val);
  last_require_osd_release = ceph_release_from_name(val);

  // mount.
  store->mount();
  ...

  // read superblock
  read_superblock();

  // load up "current" osdmap
  get_osdmap();
  osdmap = get_map(superblock.current_epoch);
  set_osdmap(osdmap);
  ...

  check_osdmap_features();

  clear_temp_objects();

  // load up pgs (as they previously existed)
  load_pgs();

  dout(2) << "superblock: I am osd." << superblock.whoami << dendl;
  ...
  return 0;
}

3. Mount BlueStore Device

src/os/bluestore/BlueStore.h

class BlueStore : public ObjectStore, public md_config_obs_t {
  int mount() override { return _mount(); }
  ...
};

src/os/bluestore/BlueStore.cc

int BlueStore::_mount() {
  if (cct->_conf->bluestore_fsck_on_mount) {
    fsck(cct->_conf->bluestore_fsck_on_mount_deep);
  }
  ...

  _open_db_and_around(false); // Open RocksDB by calling BlueStore::_open_db()
  _upgrade_super();
  _open_collections();
  _reload_logger();
  _kv_start();
  _deferred_replay();
  ...

  mounted = true;
  return 0;
}
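
The excerpt above leaves out return-value checks. As a rough illustration of how an ordered sequence like this can stop at the first failing step, here is a minimal sketch. Step and run_steps() are hypothetical, the step names in the usage are only borrowed from the excerpt, and treating a negative return value as failure is an assumption of the sketch, not a statement about how BlueStore handles errors.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// A named step that returns 0 on success or a negative code on failure.
using Step = std::pair<std::string, std::function<int()>>;

// Runs the steps in order. Returns 0 if all succeed, otherwise the code of
// the first failing step. Names of completed steps are appended to `done`.
int run_steps(const std::vector<Step>& steps, std::vector<std::string>* done) {
  for (const auto& step : steps) {
    int r = step.second();
    if (r < 0) {
      return r;  // later steps are never started
    }
    done->push_back(step.first);
  }
  return 0;
}
```

With steps such as open_db, open_collections and kv_start, a failure in open_collections would leave only open_db in the list of completed steps, so the caller knows what has to be torn down again.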

  [1] Bluestore–bluefs initialization part of the source code analysis (ProgrammerSought)
