Semi-supervised Learning Subset Selection Data Loaders
In this section, we consider different subset selection based data loaders geared towards efficient and robust learning in the standard semi-supervised learning setting.
DSS Dataloader (Base Class)
- class cords.utils.data.dataloader.SSL.dssdataloader.DSSDataLoader(full_data, dss_args, logger, *args, **kwargs)[source]
Bases:
object
Implementation of the DSSDataLoader class, which serves as the base class for the dataloaders of the other selection strategies in the semi-supervised learning framework.
- Parameters
full_data (torch.utils.data.Dataset Class) – Full dataset from which data subset needs to be selected.
dss_args (dict) – Data subset selection arguments dictionary
logger (class) – Logger class for logging the information
Non-Adaptive Subset Selection Data Loaders
- class cords.utils.data.dataloader.SSL.nonadaptive.nonadaptivedataloader.NonAdaptiveDSSDataLoader(train_loader, val_loader, dss_args, logger, *args, **kwargs)[source]
Bases:
cords.utils.data.dataloader.SSL.dssdataloader.DSSDataLoader
Implementation of the NonAdaptiveDSSDataLoader class, which serves as the base class for the dataloaders of the other nonadaptive subset selection strategies in the semi-supervised learning setting.
- Parameters
train_loader (torch.utils.data.DataLoader class) – Dataloader of the training dataset
val_loader (torch.utils.data.DataLoader class) – Dataloader of the validation dataset
dss_args (dict) – Data subset selection arguments dictionary
logger (class) – Logger for logging the information
- class cords.utils.data.dataloader.SSL.nonadaptive.craigdataloader.CRAIGDataLoader(train_loader, val_loader, dss_args, logger, *args, **kwargs)[source]
Bases:
cords.utils.data.dataloader.SSL.nonadaptive.nonadaptivedataloader.NonAdaptiveDSSDataLoader
Implementation of CRAIGDataLoader, which serves as the dataloader for the nonadaptive CRAIG subset selection strategy for semi-supervised learning. It is adapted from the version given in the paper [1].
- Parameters
train_loader (torch.utils.data.DataLoader class) – Dataloader of the training dataset
val_loader (torch.utils.data.DataLoader class) – Dataloader of the validation dataset
dss_args (dict) – Data subset selection arguments dictionary required for the CRAIG subset selection strategy
logger (class) – Logger for logging the information
- class cords.utils.data.dataloader.SSL.nonadaptive.submoddataloader.FacLocDataLoader(train_loader, val_loader, dss_args, logger, *args, **kwargs)[source]
Bases:
cords.utils.data.dataloader.SSL.nonadaptive.submoddataloader.SubmodDataLoader
Implementation of the FacLocDataLoader class for the nonadaptive facility location based subset selection strategy in the semi-supervised learning setting.
- Parameters
train_loader (torch.utils.data.DataLoader class) – Dataloader of the training dataset
val_loader (torch.utils.data.DataLoader class) – Dataloader of the validation dataset
dss_args (dict) – Data subset selection arguments dictionary
logger (class) – Logger for logging the information
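To illustrate what facility location based selection computes, here is a minimal pure-Python sketch of the standard greedy heuristic. It is not the CORDS implementation: the function names, the toy similarity matrix, and the assumption of non-negative similarities are all illustrative choices that do not come from the library.

```python
from typing import List, Sequence


def facility_location_value(sim: Sequence[Sequence[float]], subset: Sequence[int]) -> float:
    """f(S): for every point i, take its best similarity to any j in S, then sum."""
    if not subset:
        return 0.0
    return sum(max(row[j] for j in subset) for row in sim)


def greedy_facility_location(sim: Sequence[Sequence[float]], budget: int) -> List[int]:
    """Pick `budget` indices by repeatedly adding the point with the largest marginal gain.

    Assumes all similarities are non-negative (hypothetical helper, not a CORDS API).
    """
    n = len(sim)
    selected: List[int] = []
    # best_cover[i] is the similarity of point i to its closest selected point so far.
    best_cover = [0.0] * n
    for _ in range(min(budget, n)):
        best_gain, best_j = -1.0, -1
        for j in range(n):
            if j in selected:
                continue
            # Marginal gain of adding j: how much each point's coverage would improve.
            gain = sum(max(sim[i][j] - best_cover[i], 0.0) for i in range(n))
            if gain > best_gain:
                best_gain, best_j = gain, j
        selected.append(best_j)
        best_cover = [max(best_cover[i], sim[i][best_j]) for i in range(n)]
    return selected


# Toy similarity matrix: points 0 and 1 are near-duplicates, as are points 2 and 3.
toy_sim = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
chosen = greedy_facility_location(toy_sim, budget=2)
```

With a budget of two, the greedy pass picks one point from each near-duplicate pair, because a second point from the same pair adds almost no coverage.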
- class cords.utils.data.dataloader.SSL.nonadaptive.submoddataloader.GraphCutDataLoader(train_loader, val_loader, dss_args, logger, *args, **kwargs)[source]
Bases:
cords.utils.data.dataloader.SSL.nonadaptive.submoddataloader.SubmodDataLoader
Implementation of the GraphCutDataLoader class for the nonadaptive graph cut function based subset selection strategy in the semi-supervised learning setting.
- Parameters
train_loader (torch.utils.data.DataLoader class) – Dataloader of the training dataset
val_loader (torch.utils.data.DataLoader class) – Dataloader of the validation dataset
dss_args (dict) – Data subset selection arguments dictionary
logger (class) – Logger for logging the information
- class cords.utils.data.dataloader.SSL.nonadaptive.submoddataloader.SaturatedCoverageDataLoader(train_loader, val_loader, dss_args, logger, *args, **kwargs)[source]
Bases:
cords.utils.data.dataloader.SSL.nonadaptive.submoddataloader.SubmodDataLoader
Implementation of the SaturatedCoverageDataLoader class for the nonadaptive saturated coverage function based subset selection strategy in the semi-supervised learning setting.
- Parameters
train_loader (torch.utils.data.DataLoader class) – Dataloader of the training dataset
val_loader (torch.utils.data.DataLoader class) – Dataloader of the validation dataset
dss_args (dict) – Data subset selection arguments dictionary
logger (class) – Logger for logging the information
- class cords.utils.data.dataloader.SSL.nonadaptive.submoddataloader.SubmodDataLoader(train_loader, val_loader, dss_args, logger, *args, **kwargs)[source]
Bases:
cords.utils.data.dataloader.SSL.nonadaptive.nonadaptivedataloader.NonAdaptiveDSSDataLoader
Implementation of the SubmodDataLoader class for the nonadaptive submodular subset selection strategies in the semi-supervised learning setting.
- Parameters
train_loader (torch.utils.data.DataLoader class) – Dataloader of the training dataset
val_loader (torch.utils.data.DataLoader class) – Dataloader of the validation dataset
dss_args (dict) – Data subset selection arguments dictionary
logger (class) – Logger for logging the information
- class cords.utils.data.dataloader.SSL.nonadaptive.submoddataloader.SumRedundancyDataLoader(train_loader, val_loader, dss_args, logger, *args, **kwargs)[source]
Bases:
cords.utils.data.dataloader.SSL.nonadaptive.submoddataloader.SubmodDataLoader
Implementation of the SumRedundancyDataLoader class for the nonadaptive sum redundancy function based subset selection strategy in the semi-supervised learning setting.
- Parameters
train_loader (torch.utils.data.DataLoader class) – Dataloader of the training dataset
val_loader (torch.utils.data.DataLoader class) – Dataloader of the validation dataset
dss_args (dict) – Data subset selection arguments dictionary
logger (class) – Logger for logging the information
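The graph cut, saturated coverage, and sum redundancy loaders are each named after a set function over a pairwise similarity matrix. The sketch below evaluates commonly used textbook forms of these three functions. The exact formulations used inside CORDS are not given in this section and may differ; the function names and the parameters `lam` and `alpha` are assumptions made for illustration.

```python
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def graph_cut_value(sim: Matrix, subset: Sequence[int], lam: float = 0.5) -> float:
    """Coverage of the whole set by the subset, minus lam times within-subset similarity."""
    coverage = sum(sim[i][j] for i in range(len(sim)) for j in subset)
    redundancy = sum(sim[i][j] for i in subset for j in subset)
    return coverage - lam * redundancy


def saturated_coverage_value(sim: Matrix, subset: Sequence[int], alpha: float = 0.5) -> float:
    """Per-point coverage, capped at a fraction alpha of that point's total similarity."""
    return sum(min(sum(row[j] for j in subset), alpha * sum(row)) for row in sim)


def sum_redundancy_value(sim: Matrix, subset: Sequence[int]) -> float:
    """Negative total pairwise similarity inside the subset (higher means less redundant)."""
    return -sum(sim[i][j] for i in subset for j in subset)


# Toy similarity matrix: points 0 and 1 are near-duplicates, as are points 2 and 3.
toy_sim = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
diverse, redundant = [0, 2], [0, 1]
scores = {
    "graph_cut": (graph_cut_value(toy_sim, diverse), graph_cut_value(toy_sim, redundant)),
    "saturated_coverage": (
        saturated_coverage_value(toy_sim, diverse),
        saturated_coverage_value(toy_sim, redundant),
    ),
    "sum_redundancy": (
        sum_redundancy_value(toy_sim, diverse),
        sum_redundancy_value(toy_sim, redundant),
    ),
}
```

On this toy matrix, all three functions score the diverse subset above the redundant one, which is the behavior a subset selection objective is meant to reward.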
Adaptive Subset Selection Data Loaders
- class cords.utils.data.dataloader.SSL.adaptive.adaptivedataloader.AdaptiveDSSDataLoader(train_loader, val_loader, dss_args, logger, *args, **kwargs)[source]
Bases:
cords.utils.data.dataloader.SSL.dssdataloader.DSSDataLoader
Implementation of the AdaptiveDSSDataLoader class, which serves as the base class for the dataloaders of the other adaptive subset selection strategies in the semi-supervised learning framework.
- Parameters
train_loader (torch.utils.data.DataLoader class) – Dataloader of the training dataset
val_loader (torch.utils.data.DataLoader class) – Dataloader of the validation dataset
dss_args (dict) – Data subset selection arguments dictionary
logger (class) – Logger for logging the information
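The idea behind an adaptive loader, re-running the selection strategy periodically during training rather than once up front, can be sketched in plain Python. `ToyAdaptiveLoader` below is a hypothetical stand-in, not the CORDS class, and the names `fraction`, `select_every`, and `strategy` are assumptions chosen for this sketch rather than documented `dss_args` keys.

```python
import random
from typing import Callable, Iterator, List, Sequence


class ToyAdaptiveLoader:
    """Hypothetical stand-in for an adaptive subset loader (not the CORDS class)."""

    def __init__(
        self,
        data: Sequence,
        fraction: float,
        select_every: int,
        strategy: Callable[[int, int], List[int]],
    ) -> None:
        self.data = list(data)
        self.budget = max(1, int(fraction * len(self.data)))
        self.select_every = select_every
        self.strategy = strategy
        self.epoch = 0
        self.num_selections = 0
        self.indices: List[int] = []

    def __iter__(self) -> Iterator:
        # Re-run the selection strategy at epochs 0, select_every, 2 * select_every, ...
        if self.epoch % self.select_every == 0:
            self.indices = self.strategy(len(self.data), self.budget)
            self.num_selections += 1
        self.epoch += 1
        return iter([self.data[i] for i in self.indices])


rng = random.Random(0)
loader = ToyAdaptiveLoader(
    data=range(100),
    fraction=0.1,
    select_every=3,
    # Placeholder strategy: uniform random indices. A real strategy would use model state.
    strategy=lambda n, k: rng.sample(range(n), k),
)
# Each pass over the loader counts as one epoch.
epoch_sizes = [len(list(loader)) for _ in range(6)]
```

Over six epochs with `select_every=3`, the strategy runs twice (at epochs 0 and 3), and every epoch iterates over a subset of 10 of the 100 items.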
- class cords.utils.data.dataloader.SSL.adaptive.retrievedataloader.RETRIEVEDataLoader(train_loader, val_loader, dss_args, logger, *args, **kwargs)[source]
Bases:
cords.utils.data.dataloader.SSL.adaptive.adaptivedataloader.AdaptiveDSSDataLoader
Implementation of RETRIEVEDataLoader, which serves as the dataloader for the adaptive RETRIEVE subset selection strategy from the paper [2].
- Parameters
train_loader (torch.utils.data.DataLoader class) – Dataloader of the training dataset
val_loader (torch.utils.data.DataLoader class) – Dataloader of the validation dataset
dss_args (dict) – Data subset selection arguments dictionary required for the RETRIEVE subset selection strategy
logger (class) – Logger for logging the information
- class cords.utils.data.dataloader.SSL.adaptive.craigdataloader.CRAIGDataLoader(train_loader, val_loader, dss_args, logger, *args, **kwargs)[source]
Bases:
cords.utils.data.dataloader.SSL.adaptive.adaptivedataloader.AdaptiveDSSDataLoader
Implementation of CRAIGDataLoader, which serves as the dataloader for the adaptive CRAIG subset selection strategy for semi-supervised learning. It is adapted from the version given in the paper [1].
- Parameters
train_loader (torch.utils.data.DataLoader class) – Dataloader of the training dataset
val_loader (torch.utils.data.DataLoader class) – Dataloader of the validation dataset
dss_args (dict) – Data subset selection arguments dictionary required for the CRAIG subset selection strategy
logger (class) – Logger for logging the information
- class cords.utils.data.dataloader.SSL.adaptive.gradmatchdataloader.GradMatchDataLoader(train_loader, val_loader, dss_args, logger, *args, **kwargs)[source]
Bases:
cords.utils.data.dataloader.SSL.adaptive.adaptivedataloader.AdaptiveDSSDataLoader
Implementation of GradMatchDataLoader, which serves as the dataloader for the adaptive GradMatch subset selection strategy for semi-supervised learning. It is adapted from the version given in the paper [3].
- Parameters
train_loader (torch.utils.data.DataLoader class) – Dataloader of the training dataset
val_loader (torch.utils.data.DataLoader class) – Dataloader of the validation dataset
dss_args (dict) – Data subset selection arguments dictionary required for the GradMatch subset selection strategy
logger (class) – Logger for logging the information
- class cords.utils.data.dataloader.SSL.adaptive.randomdataloader.RandomDataLoader(train_loader, dss_args, logger, *args, **kwargs)[source]
Bases:
cords.utils.data.dataloader.SSL.adaptive.adaptivedataloader.AdaptiveDSSDataLoader
Implementation of RandomDataLoader, which serves as the dataloader for the non-adaptive Random subset selection strategy.
- Parameters
train_loader (torch.utils.data.DataLoader class) – Dataloader of the training dataset
dss_args (dict) – Data subset selection arguments dictionary required for the Random subset selection strategy
logger (class) – Logger for logging the information
- class cords.utils.data.dataloader.SSL.adaptive.olrandomdataloader.OLRandomDataLoader(train_loader, dss_args, logger, *args, **kwargs)[source]
Bases:
cords.utils.data.dataloader.SSL.adaptive.adaptivedataloader.AdaptiveDSSDataLoader
Implementation of OLRandomDataLoader, which serves as the dataloader for the adaptive Random subset selection strategy.
- Parameters
train_loader (torch.utils.data.DataLoader class) – Dataloader of the training dataset
dss_args (dict) – Data subset selection arguments dictionary required for the Random subset selection strategy
logger (class) – Logger for logging the information
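The difference between the non-adaptive and adaptive Random strategies can be sketched as follows: the first draws one random subset and keeps it, while the second draws a fresh subset at regular intervals. The function names and the `select_every` parameter are hypothetical and are not taken from the CORDS API.

```python
import random
from typing import List


def random_fixed(n: int, k: int, epochs: int, seed: int = 0) -> List[List[int]]:
    """Non-adaptive: draw one random subset and reuse it in every epoch."""
    rng = random.Random(seed)
    subset = sorted(rng.sample(range(n), k))
    return [subset for _ in range(epochs)]


def random_online(n: int, k: int, epochs: int, select_every: int, seed: int = 0) -> List[List[int]]:
    """Adaptive: draw a fresh random subset every `select_every` epochs."""
    rng = random.Random(seed)
    schedule: List[List[int]] = []
    subset: List[int] = []
    for epoch in range(epochs):
        if epoch % select_every == 0:
            subset = sorted(rng.sample(range(n), k))
        schedule.append(subset)
    return schedule


# Per-epoch index lists for a dataset of 1000 points and a budget of 100.
fixed = random_fixed(n=1000, k=100, epochs=4)
online = random_online(n=1000, k=100, epochs=4, select_every=2)
```

In `fixed`, all four epochs share the same indices. In `online`, epochs 0 and 1 share one draw and epochs 2 and 3 share another.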
REFERENCES
- [1]
Baharan Mirzasoleiman, Jeff Bilmes, and Jure Leskovec. Coresets for data-efficient training of machine learning models. In Hal Daumé III and Aarti Singh, editors, Proceedings of the 37th International Conference on Machine Learning, volume 119 of Proceedings of Machine Learning Research, 6950–6960. PMLR, 13–18 Jul 2020. URL: https://proceedings.mlr.press/v119/mirzasoleiman20a.html.
- [2]
Krishnateja Killamsetty, Xujiang Zhao, Feng Chen, and Rishabh K Iyer. RETRIEVE: coreset selection for efficient and robust semi-supervised learning. In A. Beygelzimer, Y. Dauphin, P. Liang, and J. Wortman Vaughan, editors, Advances in Neural Information Processing Systems. 2021. URL: https://openreview.net/forum?id=jSz59N8NvUP.
- [3]
Krishnateja Killamsetty, Durga S, Ganesh Ramakrishnan, Abir De, and Rishabh Iyer. Grad-match: gradient matching based data subset selection for efficient deep model training. In Marina Meila and Tong Zhang, editors, Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, 5464–5474. PMLR, 18–24 Jul 2021. URL: https://proceedings.mlr.press/v139/killamsetty21a.html.