cold start handling in ranked batch sampling #28

@zhangyu94

Description

Hi!

The cold start handling in ranked batch sampling seems to differ from what Cardoso et al. describe in "Ranked batch-mode active learning".

modAL/modAL/batch.py

Lines 133 to 139 in 452898f

if classifier.X_training is None:
    labeled = select_cold_start_instance(X=unlabeled, metric=metric, n_jobs=n_jobs)
elif classifier.X_training.shape[0] > 0:
    labeled = classifier.X_training[:]
# Define our record container and the maximum number of records to sample.
instance_index_ranking = []

In modAL's implementation, on a cold start, the instance selected by select_cold_start_instance is not added to the ranking list instance_index_ranking.
In "Ranked batch-mode active learning", however, that instance seems to be the first item in the ranking.

return X[best_coldstart_instance_index].reshape(1, -1)

If my understanding of the algorithm proposed in the paper and of modAL's implementation is correct, we could change the return statement of select_cold_start_instance to
return best_coldstart_instance_index, X[best_coldstart_instance_index].reshape(1, -1)
then store best_coldstart_instance_index in instance_index_ranking and revise ranked_batch accordingly.
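
To make the proposal concrete, here is a rough, standalone sketch of what I have in mind. It is not modAL's actual code: select_cold_start_instance_with_index and init_ranking are hypothetical names, and I am assuming a plain Euclidean metric and that the cold start instance is the one with the smallest mean distance to all other instances.

```python
import numpy as np


def select_cold_start_instance_with_index(X):
    """Hypothetical variant that returns the index as well as the instance."""
    # Pairwise Euclidean distances via broadcasting (assumption: Euclidean metric).
    diffs = X[:, None, :] - X[None, :, :]
    distances = np.sqrt((diffs ** 2).sum(axis=-1))
    # Assumption: pick the instance with the smallest mean distance to the others.
    average_distances = distances.mean(axis=0)
    best_coldstart_instance_index = int(np.argmin(average_distances))
    return best_coldstart_instance_index, X[best_coldstart_instance_index].reshape(1, -1)


def init_ranking(X_training, unlabeled):
    """Hypothetical helper mirroring the start of ranked_batch."""
    instance_index_ranking = []
    if X_training is None:
        # Cold start: record the selected index as the first ranked item.
        index, labeled = select_cold_start_instance_with_index(unlabeled)
        instance_index_ranking.append(index)
    else:
        labeled = X_training[:]
    return labeled, instance_index_ranking


unlabeled = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]])
labeled, ranking = init_ranking(None, unlabeled)
print(ranking, labeled)
```

With this change the cold start instance would count toward the batch, so the rest of ranked_batch would presumably need to sample one fewer instance in that case.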
