A journey to Flutter liveness (pt1)
Jodamco
Posted on June 18, 2024
Here we are again. This time I decided to write the posts as I go with the project, so the series may or may not have an end, and it certainly won't have an order!
Google Machine Learning Kit
I was trying to decide on a Flutter side project to exercise some organization and concepts from the framework, and since AI is all the hype, I did some research and found Google's Machine Learning Kit, a set of machine learning tools for different tasks such as face detection, text recognition, and document digitization, among other features (you should really check the link above). They're kind of plug and play: you just install the plugin dependency and use the capabilities, with no API integrations or third-party accounts required, so I decided to move on with it.
For the project itself, I decided to go with liveness (oh boy, if I had done some more research beforehand maybe I would've picked something else) because I got curious about how current tools differentiate between photographs and real people. I have to be honest and say I didn't do deep research on the matter; I'll follow the path of reproducing the results I found in this great article. In it, the author concludes that using the GMLKit for liveness is feasible, and my first goal is to reproduce the Euler angle graphs, but in a Flutter app. I'm not sure what a final casual use for liveness might be, but I'm sure I'll learn through the process, so let's start!
Flutter app
The startup of a project is always a good moment. You know: follow the docs for the init, run a flutter create my_app, or do it from VS Code through the command palette. I'll be using FVM to manage the Flutter version, and you can check out the full code here.
Camera Layer
First things first, I needed the camera preview set up to get the image data (and to see something, at least). For that, I added camera and permission_handler as dependencies to get access to the camera widgets. I also tried to split my camera component in a way that keeps it agnostic to the machine learning layer, so I can reuse it in different contexts. Here's a small part of the camera widget:
class CustomCameraPreview extends StatefulWidget {
  final Function(ImageData inputImage)? onImage;
  final CustomPaint? customPaint;
  final VoidCallback? onCameraFeedReady;

  const CustomCameraPreview({
    super.key,
    this.onImage,
    this.onCameraFeedReady,
    this.customPaint,
  });

  @override
  State<CustomCameraPreview> createState() => _CustomCameraPreviewState();
}

class _CustomCameraPreviewState extends State<CustomCameraPreview> {
  //... more code

  Future<void> _startLiveFeed() async {
    if (selectedCamera == null) {
      setState(() {
        hasError = true;
      });
      return;
    }

    _controller = CameraController(
      selectedCamera!,
      ResolutionPreset.high,
      enableAudio: false,
      imageFormatGroup: Platform.isAndroid
          ? ImageFormatGroup.nv21
          : ImageFormatGroup.bgra8888,
    );

    await _controller?.initialize();
    _controller?.startImageStream(_onImage);

    if (widget.onCameraFeedReady != null) {
      widget.onCameraFeedReady!();
    }
  }

  //... more code

  Widget display() {
    if (isLoading) {
      return PreviewPlaceholder.loadingPreview();
    } else if (hasError) {
      return PreviewPlaceholder.previewError(
        onRetry: _initialize,
      );
    } else if (!hasPermissions) {
      return PreviewPlaceholder.noPermission(
        onAskForPermissions: _initialize,
      );
    } else {
      return Stack(
        fit: StackFit.expand,
        children: <Widget>[
          Center(
            child: CameraPreview(
              _controller!,
              child: widget.customPaint,
            ),
          ),
        ],
      );
    }
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      body: display(),
    );
  }
}
I think the most important part is the startup of the camera live feed. When creating the camera controller you must set the image type through the imageFormatGroup property, since this is required for the ML Kit plugin to work. The values in the code above are the ones recommended for each platform; you can read more in the docs of the face detection plugin. This widget was inspired by the example widget from the official docs.
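The snippet also references a hasPermissions flag and an _initialize retry hook that aren't shown above. Just as a reference, here's a minimal sketch of how that permission step can look with permission_handler; the method and field names mirror the snippet, but the body is my assumption, not necessarily the project's exact code.

import 'package:permission_handler/permission_handler.dart';

// Hypothetical body for _initialize inside _CustomCameraPreviewState:
// request camera access and only start the live feed once it's granted.
Future<void> _initialize() async {
  setState(() => isLoading = true);

  final PermissionStatus status = await Permission.camera.request();
  final bool granted = status.isGranted;

  if (granted) {
    await _startLiveFeed();
  }

  setState(() {
    hasPermissions = granted;
    isLoading = false;
  });
}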
One great thing I got to try out was the use of factory constructors on widgets when I wrote the placeholder for the camera. There were other options (widget extensions and enums were suggested to me), but in the end I was satisfied with the factories and decided to let them be, since they simplified the way the parent calls the placeholder.
enum PreviewType { permission, loading, error }

class PreviewPlaceholder extends StatelessWidget {
  final PreviewType type;
  final VoidCallback? onAction;

  const PreviewPlaceholder._({
    required this.type,
    this.onAction,
  });

  factory PreviewPlaceholder.noPermission({
    required VoidCallback onAskForPermissions,
  }) =>
      PreviewPlaceholder._(
        type: PreviewType.permission,
        onAction: onAskForPermissions,
      );

  factory PreviewPlaceholder.loadingPreview() => const PreviewPlaceholder._(
        type: PreviewType.loading,
      );

  factory PreviewPlaceholder.previewError({required VoidCallback onRetry}) =>
      PreviewPlaceholder._(
        type: PreviewType.error,
        onAction: onRetry,
      );

  @override
  Widget build(BuildContext context) {
    return Column(
      mainAxisAlignment: MainAxisAlignment.center,
      children: [
        if (type == PreviewType.permission)
          ElevatedButton(
            onPressed: onAction,
            child: const Text("Ask for camera permissions"),
          ),
        if (type == PreviewType.error) ...[
          const Text("Couldn't load camera preview"),
          ElevatedButton(
            onPressed: onAction,
            child: const Text("Retry"),
          ),
        ],
        if (type == PreviewType.loading) ...const [
          Text("Loading preview"),
          Center(
            child: LinearProgressIndicator(),
          )
        ],
      ],
    );
  }
}
With the camera layer done, let's dive into face detection.
Face detection
For the face detection, so far, I just needed to add two more dependencies: google_mlkit_commons and google_mlkit_face_detection. The GMLKit docs recommend using the task-specific plugin dependencies for release builds instead of the all-in-one Flutter GMLKit dependency.
If you write your first approach to face detection, it can be very straightforward to get results, except for one problem: if you're on Android with the android-camerax plugin, you will not be able to use the camera image with face detection directly. This is because, although you must set ImageFormatGroup.nv21 as the output format, the current version of the Flutter android-camerax plugin will only provide images in the yuv_420_888 format (you may find more info here). The good part is that someone provided a solution (community always rocks 🚀).
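The linked workaround is what I ended up using in the project. Conceptually, it repacks the three yuv_420_888 planes into NV21's layout: the full Y plane first, then the V and U samples interleaved. Here's a rough sketch of that idea (not the exact code from the issue, and it assumes the usual three-plane layout without exotic strides):

import 'dart:typed_data';
import 'package:camera/camera.dart';

// Sketch: repack a yuv_420_888 CameraImage (planes in Y, U, V order) into
// the NV21 byte layout the ML Kit face detector expects on Android.
Uint8List yuv420ToNv21(CameraImage image) {
  final int width = image.width;
  final int height = image.height;
  final Uint8List nv21 =
      Uint8List(width * height + 2 * (width ~/ 2) * (height ~/ 2));

  // Copy the Y plane row by row, respecting its row stride.
  final Plane yPlane = image.planes[0];
  int out = 0;
  for (int row = 0; row < height; row++) {
    final int offset = row * yPlane.bytesPerRow;
    nv21.setRange(out, out + width, yPlane.bytes, offset);
    out += width;
  }

  // Interleave V and U (NV21 wants V first), respecting pixel/row strides.
  final Plane uPlane = image.planes[1];
  final Plane vPlane = image.planes[2];
  final int uvPixelStride = uPlane.bytesPerPixel ?? 1;
  for (int row = 0; row < height ~/ 2; row++) {
    for (int col = 0; col < width ~/ 2; col++) {
      final int uvIndex = row * uPlane.bytesPerRow + col * uvPixelStride;
      nv21[out++] = vPlane.bytes[uvIndex];
      nv21[out++] = uPlane.bytes[uvIndex];
    }
  }
  return nv21;
}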
I set up the detection widget as my main "layer" for detection, since it does the heavy job of running the face detection from the GMLKit plugin. It ended up being a very small widget with a core function for face detection:
Future<void> _processImage(ImageData imageData) async {
  if (_isBusy) return;
  _isBusy = true;

  RootIsolateToken rootIsolateToken = RootIsolateToken.instance!;

  // Run the conversion + detection off the main isolate so the UI keeps
  // building its frames on time.
  final analyticData = await Isolate.run<Map<String, dynamic>>(() async {
    BackgroundIsolateBinaryMessenger.ensureInitialized(rootIsolateToken);

    final inputImage = imageData.inputImageFromCameraImage(imageData);
    if (inputImage == null) return {"faces": null, "image": inputImage};

    final FaceDetector faceDetector = FaceDetector(
      options: FaceDetectorOptions(
        enableContours: true,
        enableLandmarks: true,
      ),
    );

    final faces = await faceDetector.processImage(inputImage);
    await faceDetector.close();

    return {"faces": faces, "image": inputImage};
  });

  // ... use analyticData ("faces" and "image") to update the UI

  _isBusy = false;
}
A few comments on this function:
- It is VERY SIMPLE to get the data from the GMLKit, and it can be done without the Isolate.
- Although the Isolate is not needed, you might want to use it, since Flutter should build each frame in about 16ms. I was eager to try out Isolates and never had a really good reason before, but without one the processing of the image would drop the framerate and the app would look terrible. By using the Isolate I move all the processing and conversion out of the main event loop and guarantee that the frames are built on time.
- I decided to instantiate the face detector inside the Isolate since I had trouble passing it from the main isolate to the new one. I also do the conversion imageData.inputImageFromCameraImage(imageData) inside the isolate, since it is also time-consuming. This is what parses the yuv_420_888 format into the one needed by the GMLKit plugin. For this job, I decided that the best approach was a class that receives all the data from the camera and smoothly provides the InputImage object for the GMLKit. You can check out the class here and the extension for the conversion here.
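To tie it back to the camera layer: the detection widget only has to wrap the ML-agnostic CustomCameraPreview and feed every frame into _processImage. Here's a hedged sketch of that wiring (the widget and painter names are illustrative, not the project's exact API):

// Illustrative detection layer: wraps the project's CustomCameraPreview
// and pushes detection results into a CustomPaint overlay.
class FaceDetectionLayer extends StatefulWidget {
  const FaceDetectionLayer({super.key});

  @override
  State<FaceDetectionLayer> createState() => _FaceDetectionLayerState();
}

class _FaceDetectionLayerState extends State<FaceDetectionLayer> {
  bool _isBusy = false;
  CustomPaint? _customPaint;

  Future<void> _processImage(ImageData imageData) async {
    // ... the function shown above; once the faces come back from the
    // isolate, something like the line below repaints the overlay:
    // setState(() => _customPaint = CustomPaint(painter: FacePainter(faces)));
  }

  @override
  Widget build(BuildContext context) {
    return CustomCameraPreview(
      onImage: _processImage,
      customPaint: _customPaint,
      onCameraFeedReady: () => debugPrint('camera feed ready'),
    );
  }
}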
Results
So far I still don't have the Euler angles in a graph as I wanted, but I was at least able to get the data from the kit and paint the bounding box of my face. I also ran some tests on the execution time of the face detection: the average time to run detection on a high-quality image is about 600ms in a debug build and about 380ms in a release build. Thanks to the Isolate the app's framerate is OK, but I would like to improve this performance later.
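For reference, the Face objects returned by google_mlkit_face_detection already expose both things: the bounding box I'm painting and the head Euler angles I'll need for the graphs. A small sketch of reading them off the faces list returned by the isolate above:

import 'dart:ui' show Rect;
import 'package:flutter/foundation.dart' show debugPrint;
import 'package:google_mlkit_face_detection/google_mlkit_face_detection.dart';

// Logs the data exposed by each detected Face.
void logFaceData(List<Face> faces) {
  for (final Face face in faces) {
    final Rect box = face.boundingBox; // what the painter draws
    final double? rotX = face.headEulerAngleX; // pitch: head up/down
    final double? rotY = face.headEulerAngleY; // yaw: head left/right
    final double? rotZ = face.headEulerAngleZ; // roll: head tilt
    debugPrint('face at $box, euler angles: x=$rotX y=$rotY z=$rotZ');
  }
}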
My next step will be to get the Euler Angles and paint a graph with them so I can try to reproduce the comparison between photos and real people.
See you there!