The NMPA published the “Key Points of Ultrasound Imaging Artificial Intelligence Software (Process Optimization)” on July 10, 2023, to guide manufacturers through type testing, clinical evaluation, and regulatory submission.
For an English copy of the “Key Points of Ultrasound Imaging Artificial Intelligence Software (Process Optimization)”, please email info@ChinaMedDevice.com. We charge nominal fees for the translation.
Applicable Scope
The document is applicable to AI software for process optimization in ultrasound imaging products. The products should be used by doctors with ultrasound diagnosis qualifications. Usually integrated in ultrasound imaging equipment, they are Class II or Class III devices:
- If it is integrated in independent software as a process optimization function, it is Class II.
- If it functions as assisted diagnosis software, it is Class III. Because product approval and review experience is still lacking, diagnostic ultrasound AI functions are not covered by the review points.
Highlights of the Key Points
Data Collection
Specify the equipment model, probe, imaging parameters, and acquisition time (applicable to image sequences) of the collected data samples, as well as the source institution, the background of the personnel who collected the data, and the collection time, and compare any differences against the algorithm training data requirements.
Verification of the process optimization function should simulate the actual usage process as closely as possible. It may require acquiring longer scan sequences than the images or sequences used for AI diagnosis, not just the single-frame images used to train the core algorithm.
Data Annotation
The basic information of the annotating, reviewing, and arbitrating personnel should be given in a list, including department, years of experience, institution, training and assessment status, and the amount of annotation performed. If foreign personnel are involved, their qualification requirements must be stated.
Labeling rules should state the name and specific content of the diagnosis and treatment guidelines they are based on, and explain the authority, acceptance, and usage of those guidelines in China. If any point is disputed, or if doctors need to judge based on experience, the impact on labeling consistency should be analyzed.
A flowchart should be used to introduce the labeling, review, and arbitration process for a single image or single sequence. If the labeling is more complicated, typical images or sequences plus flowcharts should be used to explain the labeling content of each step and the handling of special cases. If automatic labeling is used, it should be described in detail, including whether image sequences are labeled frame by frame.
For region segmentation, labels from multiple annotators will inevitably differ, so figures should be used to illustrate how the annotators' results are merged and what impact the merging has. Briefly describe the management process and time period of all image annotations, especially the quality assessment carried out during the annotation cycle. If different functions use different data sets or data volumes for labeling, the specifics should be stated.
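One way to picture the merging of segmentation labels from several annotators is sketched below. The Key Points do not prescribe a merging method; pixel-wise majority voting and the Dice coefficient are assumptions chosen for illustration, and the function names and the tiny 3x3 masks are hypothetical.

```python
from itertools import combinations


def dice(mask_a, mask_b):
    """Dice overlap between two binary masks given as nested lists of 0/1."""
    flat_a = [p for row in mask_a for p in row]
    flat_b = [p for row in mask_b for p in row]
    intersection = sum(a & b for a, b in zip(flat_a, flat_b))
    total = sum(flat_a) + sum(flat_b)
    # Two empty masks are treated as agreeing perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total


def majority_vote(masks):
    """Keep a pixel when more than half of the annotators marked it."""
    n = len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [1 if 2 * sum(m[r][c] for m in masks) > n else 0 for c in range(cols)]
        for r in range(rows)
    ]


def pairwise_dice(masks):
    """Dice score for every pair of annotators, keyed by annotator index."""
    return {
        (i, j): dice(masks[i], masks[j])
        for i, j in combinations(range(len(masks)), 2)
    }


# Hypothetical 3x3 masks from three annotators (illustrative data only).
annotator_masks = [
    [[0, 1, 1], [0, 1, 1], [0, 0, 0]],
    [[0, 1, 1], [0, 1, 0], [0, 0, 0]],
    [[0, 1, 0], [0, 1, 1], [0, 1, 0]],
]
merged = majority_vote(annotator_masks)
agreement = pairwise_dice(annotator_masks)
```

The pairwise scores give a simple measure of how inconsistent the annotators are, and comparing each annotator's mask with the merged mask is one way to describe the impact of the merging step.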
Dataset Construction
For the basic database before labeling, and for the training set, tuning set, and test set divided after labeling, the sample size and distribution and the basis for determining them should be given, together with the method and basis of the set division.
Given the “patient-structure-section-image/image sequence” hierarchy (that is, the images or image sequences of different sections of a given structure in a given person), the three data sets should be mutually exclusive at the patient level, or at minimum pairwise disjoint at the structure level.
If disjointness is enforced only at a narrower level, a justification should be given. Provide the results of a duplicate check verifying that the samples in the training set, tuning set, and test set do not overlap.
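A patient-level split and the accompanying duplicate check can be sketched as follows. This is a minimal illustration under stated assumptions, not a method taken from the Key Points: the record fields `patient_id` and `pixels`, the 70/10/20 ratios, and all function names are hypothetical.

```python
import hashlib
import random
from itertools import combinations


def split_by_patient(samples, ratios=(0.7, 0.1, 0.2), seed=0):
    """Assign whole patients to train/tune/test so no patient spans two sets."""
    patients = sorted({s["patient_id"] for s in samples})
    random.Random(seed).shuffle(patients)
    n_train = int(len(patients) * ratios[0])
    n_tune = int(len(patients) * ratios[1])
    assignment = {}
    for i, pid in enumerate(patients):
        if i < n_train:
            assignment[pid] = "train"
        elif i < n_train + n_tune:
            assignment[pid] = "tune"
        else:
            assignment[pid] = "test"
    splits = {"train": [], "tune": [], "test": []}
    for s in samples:
        splits[assignment[s["patient_id"]]].append(s)
    return splits


def overlap_report(splits, key):
    """Return the values of key(sample) shared by each pair of sets."""
    seen = {name: {key(s) for s in items} for name, items in splits.items()}
    return {(a, b): seen[a] & seen[b] for a, b in combinations(sorted(seen), 2)}


def content_hash(sample):
    """Hash of raw pixel bytes, to catch one image stored under two IDs."""
    return hashlib.sha256(sample["pixels"]).hexdigest()


# Hypothetical records: 10 patients, 2 images each (illustrative data only).
samples = [
    {"patient_id": f"P{p:02d}", "pixels": bytes([p, k])}
    for p in range(10)
    for k in range(2)
]
splits = split_by_patient(samples)
patient_overlap = overlap_report(splits, key=lambda s: s["patient_id"])
image_overlap = overlap_report(splits, key=content_hash)
```

Empty sets for every pair in both reports are the kind of evidence a duplicate check would record. If disjointness is enforced only at the structure level, the same `overlap_report` helper could be called with a key that combines the patient and structure identifiers.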
The distribution of samples should take into account factors such as equipment models, probes, imaging parameters, examination methods, scanning angles/sections/depths, poor imaging quality or insufficient range, epidemiological distribution, expected application scenarios, confusing images, and influencing physical conditions or diseases, implants, and congenital or acquired abnormalities in body structure.
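As a rough illustration of how sample distribution across such factors might be tabulated, the sketch below counts samples per factor combination and lists combinations that a subset fails to cover. The field names (`model`, `probe`, `quality`), the records, and the function names are hypothetical and are not specified by the Key Points.

```python
from collections import Counter


def distribution(samples, factors):
    """Count samples for each combination of the given factor fields."""
    return Counter(tuple(s[f] for f in factors) for s in samples)


def missing_strata(reference, subset, factors):
    """Factor combinations present in reference but absent from subset."""
    return set(distribution(reference, factors)) - set(distribution(subset, factors))


# Hypothetical metadata records; field names and values are illustrative only.
all_samples = [
    {"model": "A", "probe": "convex", "quality": "good"},
    {"model": "A", "probe": "convex", "quality": "poor"},
    {"model": "A", "probe": "linear", "quality": "good"},
    {"model": "B", "probe": "convex", "quality": "good"},
]
test_set = all_samples[:2]

counts = distribution(all_samples, ["model", "probe"])
gaps = missing_strata(all_samples, test_set, ["model", "probe"])
```

Here `gaps` lists the model and probe combinations that appear in the full data but not in the test set, which is one way to spot coverage holes before finalizing a set division.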
Data augmentation can be conducted based on the “Guideline for Artificial Intelligence Medical Devices”. The submission should focus on explaining the augmentation method and its implementation, analyzing the similarity between augmented samples and real samples, and analyzing the impact on the algorithm. For data augmentation that uses a generative adversarial network (GAN), the basic information of the algorithm and the basis for selecting it shall be provided.
If it is necessary to collect sequence images of the scanning process for algorithm verification, the characteristics, sample size, and sample distribution of the sequence images should be explained.
Outline of the Key Points
1. Product Technical Requirements and Testing Reports
1.1 Specification information
1.2 Performance indicators
2. Software Research
3. Algorithm Research
3.1 Algorithm basic information
3.1.1 Functional description
3.1.2 Algorithm description
3.2 Algorithmic risk management
3.3 Algorithm requirements specification
3.4 Data quality control
3.4.1 Data collection
3.4.2 Data collation
3.4.3 Data labeling
3.4.4 Dataset construction
3.5 Algorithm training
3.6 Algorithm verification and confirmation
3.6.1 Algorithm performance evaluation
3.6.2 Evaluation of algorithm performance influencing factors
3.6.3 Comprehensive evaluation of algorithm performance
4. User Training